wiki:HtmlParsing

Version 1 (modified by jukka, 12 years ago)

--

When displaying a large body of text, which may or may not contain HTML tags, the following things must be done:

  • Allow all tags inserted by Kupu.
  • Parse TeX code, \( ..code.. \) or \begin{equation} ... \end{equation}, with the TeX parser and replace it with image tags.
  • Parse bracket links, [objid linkname], and replace them with a-href:s.
  • If the text is not inside any other tags, TeX code or bracket links, convert line breaks to br:s.
  • If there are words longer than 50 characters, chop them, unless they are inside "":s or <>:s.
  • Allow only the following tags: whitelist=??,??; remove everything else.
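
Three of the steps above (bracket links, line breaks, long words) can be sketched as small helpers. This is only a rough sketch under assumptions: the URL scheme in LINK_URL_TEMPLATE is hypothetical (the page does not say how an objid maps to a URL), "chop" is read here as "split with a space", and the helpers expect plain text that is already known to be outside tags, TeX code and quotes.

```python
import html
import re

# Hypothetical URL scheme: the page does not define how an objid becomes a URL.
LINK_URL_TEMPLATE = "/object/{objid}"
MAX_WORD_LENGTH = 50

# [objid linkname]: objid has no whitespace, linkname runs to the closing bracket.
BRACKET_LINK = re.compile(r"\[([^\s\]]+)\s+([^\]]+)\]")


def replace_bracket_links(text):
    """Turn every [objid linkname] into an <a href> element."""
    def to_anchor(match):
        objid, name = match.group(1), match.group(2).strip()
        url = LINK_URL_TEMPLATE.format(objid=objid)
        return '<a href="{}">{}</a>'.format(
            html.escape(url, quote=True), html.escape(name)
        )
    return BRACKET_LINK.sub(to_anchor, text)


def linebreaks_to_br(text):
    """Convert line breaks to br tags, keeping the newline for readability."""
    return text.replace("\r\n", "\n").replace("\n", "<br />\n")


def chop_long_words(text, limit=MAX_WORD_LENGTH):
    """Split words longer than `limit` into space-separated pieces.

    Assumption: "chop" means "break up", not "truncate".
    """
    def split_word(match):
        word = match.group(0)
        return " ".join(word[i:i + limit] for i in range(0, len(word), limit))
    return re.sub(r"\S{%d,}" % (limit + 1), split_word, text)
```

The helpers do not check where they are applied, so calling chop_long_words on finished markup would also break up long href values. They are meant to run only on the plain-text parts of the input.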

What else? Please fill in the whitelist. It would be best if we could do this in as few passes as possible. Maybe a single pass that finds all of the start tags and end tags, TeX code and bracket links included, and then deals with them recursively.
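
One possible shape for such a single pass is a combined regular expression that splits the input into TeX code, bracket links, tags and plain text in one left-to-right scan. The sketch below is an assumption about how this could look, not an agreed design: the names TOKEN and tokenize are made up for illustration, the result is a flat token stream (nesting and recursion are left out), and the tag pattern is deliberately naive.

```python
import re

# One alternation per kind of markup; the leftmost match wins at each position.
TOKEN = re.compile(
    r"(?P<tex>\\\(.*?\\\)|\\begin\{equation\}.*?\\end\{equation\})"
    r"|(?P<link>\[[^\s\]]+\s+[^\]]+\])"
    r"|(?P<tag></?[A-Za-z][^>]*>)",
    re.DOTALL,
)


def tokenize(text):
    """Yield (kind, chunk) pairs that cover the whole input in one pass.

    kind is one of "tex", "link", "tag" or "text".
    """
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() > pos:
            # Plain text between two recognised pieces of markup.
            yield ("text", text[pos:match.start()])
        yield (match.lastgroup, match.group(0))
        pos = match.end()
    if pos < len(text):
        yield ("text", text[pos:])
```

A caller could then send "tex" chunks to the TeX parser, "link" chunks to the link builder, check "tag" chunks against the whitelist, and apply line-break conversion and word chopping only to "text" chunks. Joining the chunks back together reproduces the input, so nothing is lost during the scan.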