In HTML5, the <svg> tag is phrasing content according to the specification. However, HtmlCleaner doesn't include it in the definition of PHRASING_TAGS and applies CLOSE_BEFORE_TAGS to svg, which means that, for example, the following HTML code
<p><svg xmlns="http://www.w3.org/2000/svg" version="1.1">
<circle cx="100" cy="50" fill="red" r="40" stroke="black" stroke-width="2"></circle>
</svg></p>
is incorrectly cleaned as
<p></p><svg xmlns="http://www.w3.org/2000/svg" version="1.1">
<circle cx="100" cy="50" fill="red" r="40" stroke="black" stroke-width="2"></circle>
</svg>
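A regression check for this could look roughly like the sketch below. It does not call HtmlCleaner at all: it only takes a cleaned fragment as a string and verifies that every svg element still has a p parent. The class name SvgNestingCheck and the method svgStaysInsideParagraph are hypothetical names made up for illustration, and the sketch assumes the cleaned output is well-formed XML once wrapped in a synthetic root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SvgNestingCheck {

    /**
     * Hypothetical helper (not part of HtmlCleaner): returns true only if the
     * fragment contains at least one svg element and every svg has a p parent.
     */
    public static boolean svgStaysInsideParagraph(String fragment) {
        try {
            // Wrap in a synthetic root so that fragments with several top-level
            // elements (like the incorrectly cleaned output) still parse as XML.
            String wrapped = "<body>" + fragment + "</body>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));
            NodeList svgs = doc.getElementsByTagName("svg");
            if (svgs.getLength() == 0) {
                return false;
            }
            for (int i = 0; i < svgs.getLength(); i++) {
                Node parent = svgs.item(i).getParentNode();
                if (parent == null || !"p".equals(parent.getNodeName())) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String svg = "<svg xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n"
                + "<circle cx=\"100\" cy=\"50\" fill=\"red\" r=\"40\" stroke=\"black\" stroke-width=\"2\"></circle>\n"
                + "</svg>";
        // The original input: svg nested inside p.
        System.out.println(svgStaysInsideParagraph("<p>" + svg + "</p>"));
        // The incorrectly cleaned output: svg moved after an empty p.
        System.out.println(svgStaysInsideParagraph("<p></p>" + svg));
    }
}
```

In an actual test, the string passed to the helper would be the serialized output of the cleaner for the input above, so the check fails as long as the svg is pushed out of the paragraph.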
Hi @scottwilson. Long time no speak! How are you?
We have several issues that were reported quite a while ago, such as this one (also reported at https://sourceforge.net/p/htmlcleaner/bugs/231/) and https://sourceforge.net/p/htmlcleaner/bugs/230/, and we are wondering whether we can expect fixes for them.
Anything we could do to help out?
Thank you very much, you've always been very helpful to the XWiki project.
-Vincent
Similar bugs also affect the math, embed, img, data, object, picture, video, iframe, and q tags. All of them are phrasing content in the current HTML standard, but HtmlCleaner does not allow them as phrasing content (at least not as children of tags like strong). See https://github.com/xwiki/xwiki-commons/blob/ce23e117d1cd1515250855eab9bcd7226e66a72f/xwiki-commons-core/xwiki-commons-xml/src/main/java/org/xwiki/xml/internal/html/XWikiHTML5TagProvider.java for how we currently modify the HTML 5 tag definitions in XWiki to work around these bugs.
Hi both!
Always happy to help. I'll take a look at these next week. If it's just a case of tweaking the tag provider and going through the unit tests, that shouldn't take much work. I was away from the project for a while (lots happening in the day job) and then found it hard to get back into it again, as new versions of Eclipse were making things rather difficult.
Since I moved over to IntelliJ I'm feeling a lot more productive, and I can now dip back into HC and start fixing things again at a better tempo!
228 is now fixed and will be in release version 2.28. Odd coincidence!
@scottwilson it seems you forgot to close that one: I can see a commit related to it before release 2.28, see https://sourceforge.net/p/htmlcleaner/code/595/