Hi,
I am working on an XML to HTML conversion application using Java and XSL.
We are facing a few problems with handling multiple languages.
1. How do we handle characters from multiple languages in XML and XSL?
What we tried: We set the encoding value in the XML declaration based on the content of the XML document.
Problems:
1) Russian characters are not displayed properly when we view the XML in the browser.
2) We are getting our data from Lotus Notes. When the data contains special characters such as & or <, we replace them with &amp; and &lt;.
This works properly when the XML content is in English.
When the content is in Russian, some Russian characters get replaced by & and the XML fails to display in the browser. When we remove the code that replaces & with &amp;, the data with Russian characters is displayed properly (when we view the XML document in the browser).
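For reference, here is a minimal sketch of the replacement step described above. It assumes the content has already been decoded into a Java String, so the replacement works on characters rather than on encoded bytes. The class name XmlEscaper is hypothetical and is not part of our application.

```java
// Hypothetical helper: escapes XML special characters in an already-decoded String.
final class XmlEscaper {
    private XmlEscaper() {}

    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                // Every other character, including Cyrillic letters, is copied unchanged.
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Note that with this kind of replacement, a numeric character reference already present in the input (for example &#1055;) would have its leading & escaped again. Whether our Lotus Notes data contains such references is something we have not confirmed.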
3) In our application we convert XML to HTML using the Java XSL API (Xalan, Xerces).
When we try to convert XML that contains Russian characters to HTML, we get the following error:
caught Exception = [ javax.xml.transform.TransformerException: SAX Exception
at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java(Compiled Code))
at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java(Inlined Compiled Code))
at org.apache.xalan.xslt.XSLTEngineImpl.process(XSLTEngineImpl.java(Compiled Code))
at com.ibm.sales.ssi.apilite.apilitenew.showHTML(apilitenew.java:1332)
at com.ibm.sales.ssi.apilite.apilitenew.getAPILiteHTML(apilitenew.java:878)
at com.ibm.sales.ssi.apilite.apilitenew.doGet(apilitenew.java:569)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
We tried putting the correct charset in the corresponding meta tag in the XSL. The XML has the matching encoding parameter for Russian.
When the content is Russian, our XML has the following declaration, with the encoding set to windows-1251:
<?xml version="1.0" encoding="windows-1251"?>
Our XSL contains the following meta tag so that the browser displays the Russian character set:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
We are using the following Java code to convert XML to HTML:
// Output format: xhtml method, windows-1251 encoding, indenting enabled
OutputFormat format = new OutputFormat("xhtml", "windows-1251", true);
format.setPreserveSpace(true);
format.setDoctype(
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd");
format.setOmitXMLDeclaration(true);

// Serializer created for the output stream with the format above
SerializerFactory factory = SerializerFactory.getSerializerFactory("xml");
Serializer serializer = factory.makeSerializer(out, format);
serializer.setOutputFormat(format);

// Run the transformation; the result is written to out
XSLTProcessor xsltProc = XSLTProcessorFactory.getProcessor();
xsltProc.process(
    new XSLTInputSource(xmlFileName),
    new XSLTInputSource(xslFilePath),
    new XSLTResultTarget(out));
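For comparison, here is a minimal sketch of the same kind of conversion written against the standard JAXP API (javax.xml.transform), with the output encoding set explicitly on the transformer. This is not the code our application uses. The class name XslToHtml, its method, and the choice to pass the XML and XSL as strings are assumptions made only to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies a stylesheet and returns the result as encoded bytes.
final class XslToHtml {
    private XslToHtml() {}

    static byte[] transform(String xml, String xsl, String encoding) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            // Ask the serializer for HTML output in the requested encoding.
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

            // Writing to a byte stream lets the transformer do the character encoding.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(
                    new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }
}
```

A call such as XslToHtml.transform(xmlText, xslText, "windows-1251") would return bytes that should be sent to the browser with a matching charset in the Content-Type header.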
Please help us find a solution to the above problems.