I have a problem with the HTML source in a JEditorPane. When I load HTML source into my editor, the editor changes it. How can I keep the source unchanged, or what is the reason for this behavior?
It looks like a parsing problem: style='font-size:11.0pt;font-family:"Palatino Linotype"' is changed to font face="Palatino Linotype" size="11.0pt".
The HTML parser creates an appropriate Document structure from the HTML source string, which means a kind of tree of Elements. Some attributes are stored on the leaves of the tree (e.g. the text attributes). When the HTML is written back, HTMLWriter walks that Element structure and produces its own representation of it, not the text you originally loaded. I guess that if you reopen the HTML returned by getText(), the view will be the same.
Why do they do this?
HTMLEditorKit supports editing, so the initial structure can be changed. Keeping the original source text and reflecting every structural change back into it is not possible (or, better said, complicated). So they simply ignore the original text.
The same happens when you add extra "\n" characters between tags. An extra "\n" after the end of a paragraph has no visual representation in the resulting view, so it is skipped.
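If you want to see the round trip for yourself, here is a minimal sketch. It uses HTMLEditorKit and HTMLDocument directly instead of a JEditorPane, on the assumption that the pane's setText()/getText() go through the same kit read and write path. The class name RoundTripDemo and the method roundTrip are just names I made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical demo class: parses HTML into a document and writes it back out.
class RoundTripDemo {

    // Parses the given HTML into an HTMLDocument (a tree of Elements),
    // then serializes that tree again. The original string is not kept.
    static String roundTrip(String html) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();

        // Parsing step: the source text is turned into the Element structure.
        kit.read(new StringReader(html), doc, 0);

        // Writing step: the output is generated from the Element structure.
        StringWriter out = new StringWriter();
        kit.write(out, doc, 0, doc.getLength());
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String source =
            "<p style='font-size:11.0pt;font-family:\"Palatino Linotype\"'>Hello</p>";
        System.out.println("Original:   " + source);
        System.out.println("Round trip: " + roundTrip(source));
    }
}
```

Running it prints the original fragment next to the regenerated one, so you can compare how the style attribute and the whitespace come back. The exact output may differ between Java versions, so I have not pasted one here. You can also pass in a variant with extra "\n" between the tags and compare the two results.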
Hope this helps.
subject: JEditorPane HTML parsing problem with CSS