I'm generating a String containing some HTML that contains many accented characters. When I try to save it to a file, the accents and Cyrillic characters are mangled. I would like to be able to save a file like this one:
Thanks a lot for your answer. I'm new to these encoding problems and I'm lost... As my end target is a browser, I've tried viewing the file in Firefox, both as rendered HTML and as source. I get the same hieroglyphs. If I look at it using JEdit, the same characters appear. I guess (?) JEdit at least is using the system default encoding (MacRoman). As for Firefox, I guess (???) it should be able to interpret the file correctly if it really were in UTF-8...
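One way to check whether a file "really is" UTF-8 is to look at the bytes rather than at what an editor displays. The sketch below is only an illustration, not part of the original code: the class name EncodingBytes and the helper hex are made up for this example, and ISO-8859-1 stands in for a non-UTF-8 default such as MacRoman, which may not be installed in every JVM. It also assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Hypothetical helper: renders a byte array as space-separated hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // An accented Latin letter (é) followed by a Cyrillic letter (Ж).
        String sample = "\u00e9\u0416";

        // In UTF-8 each of these two characters takes two bytes.
        System.out.println(hex(sample.getBytes(StandardCharsets.UTF_8)));      // C3 A9 D0 96

        // ISO-8859-1 has é as a single byte but no Cyrillic at all,
        // so the unmappable Ж is replaced by '?' (0x3F).
        System.out.println(hex(sample.getBytes(StandardCharsets.ISO_8859_1))); // E9 3F
    }
}
```

If a hex dump of the saved file shows 3F where a Cyrillic letter should be, the character was probably already lost when the file was written, and no viewer setting will bring it back.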
As an attempt to find a workaround, I've successfully converted my messy HTML to XHTML using TagSoup. As XML is mainly UTF-8 encoded, I naively thought third-party XML libraries could handle it for me. Special characters look great in the Eclipse console, but when I save the file using the DOM4J XMLWriter:
org.dom4j.io.OutputFormat format = org.dom4j.io.OutputFormat.createCompactFormat();
format.setEncoding("UTF-8");
format.setNewlines(true);
format.setIndentSize(2);
format.setTrimText(false);
org.dom4j.io.XMLWriter xmlWriter = new org.dom4j.io.XMLWriter(new FileWriter("/mydir/tagSoup2.html"), format);
xmlWriter.write(docXHtml);
xmlWriter.flush();
I get ? in place of the accents... I'm on a Mac, and the file system's encoding isn't UTF-8... I read that this could be the problem... Would it help if I ran the code on Windows or Linux?
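A likely suspect here is the FileWriter: the FileWriter(String) constructor encodes characters with the platform default encoding, whatever the XML declaration claims. Below is a minimal sketch of writing through a Writer with an explicit UTF-8 encoding instead, using only the standard library. The class name Utf8FileWrite, the helper writeUtf8 and the temp-file name are invented for the example, and it assumes Java 7 or later (try-with-resources, StandardCharsets, java.nio.file).

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8FileWrite {
    // Hypothetical helper: writes text to a file as UTF-8,
    // independent of the platform default encoding.
    static void writeUtf8(Path target, String content) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(target.toFile()), StandardCharsets.UTF_8)) {
            out.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file is used here only so the sketch can run anywhere.
        Path target = Files.createTempFile("utf8-demo", ".html");
        String html = "<p>caf\u00e9 \u0416</p>";

        writeUtf8(target, html);

        // Decode the raw bytes as UTF-8 and compare with the original text.
        String readBack = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        System.out.println(readBack.equals(html));
    }
}
```

Since XMLWriter is constructed from a Writer in the code above, an OutputStreamWriter built this way could presumably be passed in place of the FileWriter. If that is the cause, the result should no longer depend on whether the code runs on Mac, Windows or Linux.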
Thanks
[ November 02, 2006: Message edited by: sven coleman ]