
Transformer outputs wrong encoding

Milan Tomic
Greenhorn

Joined: Feb 19, 2007
Posts: 9
I have this case:



and this line:

System.out.println("ret: " + ret);

prints XML with "utf-8" encoding:



Why? It should be us-ascii.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41812
    
While I don't have an answer to that, two remarks:

1) Check that the document really only contains characters in the US-ASCII range. I could imagine it falling back to UTF-8 if there are non-ASCII characters in the document.

2) Note that UTF-8 is identical to US-ASCII for the entire range of characters contained in US-ASCII.
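Both remarks can be checked with a few lines of Java. This is only a minimal sketch: the class name AsciiUtf8Demo and its method names are made up for illustration, and the sample strings stand in for the real document, which is not shown in the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiUtf8Demo {

    // Remark 1: true when every character of the text is in the US-ASCII range.
    static boolean isPureAscii(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    // Remark 2: true when the text encodes to the same bytes in US-ASCII and UTF-8.
    static boolean sameBytes(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        String plain = "<root>plain text</root>";
        String special = "<root>\u0161</root>"; // U+0161 is character number 353

        System.out.println(isPureAscii(plain) + " " + sameBytes(plain));     // true true
        System.out.println(isPureAscii(special) + " " + sameBytes(special)); // false false
    }
}
```

For a document that passes the first check, the two encodings differ only in the name written in the XML declaration.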


Milan Tomic
Greenhorn

Joined: Feb 19, 2007
Posts: 9
Ulf Dittmer wrote:While I don't have an answer to that, two remarks:

1) Check that the document really only contains characters in the US-ASCII range. I could imagine it falling back to UTF-8 if there are non-ASCII characters in the document.

2) Note that UTF-8 is identical to US-ASCII for the entire range of characters contained in US-ASCII.


Thank you very much for the quick reply.

I forgot to mention that when I run this code in Eclipse it produces "us-ascii" XML, but when I run the same code with the same XML from the command line (on the same PC) it produces "utf-8" XML. Does that help?

Yes, I do have special characters in my XML, but I expect the Transformer to write them in us-ascii output as character references such as &#353; and so on.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41812
    
Yes, I do have special characters in my XML but I expect Transformer to transform them into us-ascii as &#353; and so on.

I am fairly certain that this is an unreasonable expectation; I doubt that's the kind of transformation XSLT can do.
Milan Tomic
Greenhorn

Joined: Feb 19, 2007
Posts: 9
New details:

When I have the Xalan 2.7.0 JAR on my class path, the output XML is encoded in us-ascii, as I wanted.

When I don't have Xalan on my class path, this line is ignored:

transformer.setOutputProperty(OutputKeys.ENCODING, "us-ascii");

and the output XML is encoded in UTF-8. Any idea how to encode the output XML in us-ascii without an external Xalan JAR on my class path? I am using JRE 7, which should already include Xalan.
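To narrow down a difference like this, it can help to print which TransformerFactory implementation JAXP actually picked and to look at the serialized bytes directly. The following is a minimal sketch, not the original code from this thread (which is not shown): the class name EncodingCheck, its method names and the sample XML are hypothetical, and it assumes an identity transform that writes to an OutputStream rather than to a Writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EncodingCheck {

    // Name of the TransformerFactory implementation that JAXP resolved.
    // It can differ depending on which JARs are on the class path.
    static String factoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Identity transform of the given XML with the requested output encoding.
    static byte[] serialize(String xml, String encoding) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

        // Writing to a byte stream lets the serializer do the encoding itself.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("factory: " + factoryName());

        byte[] bytes = serialize("<root>\u0161</root>", "us-ascii");
        // ISO-8859-1 maps every byte to one char, so nothing is hidden when printing.
        System.out.println("ret: " + new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Running this once from Eclipse and once from the command line should show whether the two environments resolve different factory classes, which would be consistent with the class path difference described above.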
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 512
    
Simply do _not_ rely on the Xalan bundled with the JDK/JRE. It is buggy, and the people behind the successive releases have succeeded in doing nothing about it, without any consequence. It is rather ridiculous. If you want to do any XSLT with Xalan and rely on anything more than trivial functionality of its XSLT implementation, always use the current downloadable version from Apache (or from another provider).
 