I'm having some odd issues and hoping someone can help me out. I'm applying a stylesheet to incoming data that contains, among other things, section characters ('§') encoded as character references (e.g. &#xa7;). The output from this stylesheet must be passed to some older legacy systems which can't handle Unicode characters, so I've set the output encoding on the stylesheets to ISO-8859-1. When I do that, I get this in the output: 'Â§'. When I set the output encoding to UTF-8 in the stylesheets, I get only the section character, as is appropriate.

This is when running it through my application. If I simply call java org.apache.xalan.xslt.Process directly and pass it the stylesheet and the input data, the output is encoded correctly, in ISO-8859-1, without the bogus C2 ('Â') character. So I'm thinking it must be something in the way I'm processing the data. Here are the relevant bits of code:
Also, this is how I am creating the XMLReader:
The reader is being used to tell the parser to ignore the DTD, and to use namespaces. These are the features and their values:
Any suggestions would be appreciated! Thanks, -tim stevens [ April 10, 2003: Message edited by: Tim Stevens ]
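For readers hitting the same symptom, here is a small sketch of where a stray C2 byte can come from. It is not code from the thread; the class and method names are made up for illustration. The section sign is U+00A7, which ISO-8859-1 encodes as the single byte A7 and UTF-8 encodes as the two bytes C2 A7. If UTF-8 bytes are later read as ISO-8859-1, the C2 shows up as 'Â' in front of the '§'.

```java
import java.nio.charset.StandardCharsets;

public class SectionSignBytes {
    /** Formats bytes as space-separated upper-case hex, e.g. "C2 A7". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String section = "\u00A7"; // the section sign
        System.out.println(hex(section.getBytes(StandardCharsets.ISO_8859_1))); // A7
        System.out.println(hex(section.getBytes(StandardCharsets.UTF_8)));      // C2 A7
        // Two characters come back: U+00C2 followed by U+00A7.
        System.out.println(utf8ReadAsLatin1(section).length());                 // 2
    }
}
```

So a 'Â§' in output that is supposed to be ISO-8859-1 usually means that some step encoded the characters as UTF-8 instead.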
Joined: May 24, 2001
Here's what I've been trying, so far with no success:
* Got rid of the XMLReader
* Manually set the encoding on the Transformer
* Manually set the output encoding on a Writer being passed to a StreamResult
* Any and all combinations of the above
Still no further than I was this morning. Anyone have any suggestions? I'm sure it's something simple I'm missing.
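As a rough illustration of the "set the encoding on the Transformer" attempt, the sketch below uses the standard JAXP API with an identity transform and a made-up sample document, since the original stylesheet and data are not shown in the thread. The class name, the SAMPLE constant and the hasC2 helper are assumptions for the example. It hands the StreamResult an OutputStream, so the serializer itself turns characters into bytes in the requested encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Latin1StreamOutput {
    // Hypothetical sample input; the thread's real data is not shown.
    static final String SAMPLE = "<doc>&#xa7; 12</doc>";

    /** Runs an identity transform and lets the serializer write the bytes itself. */
    static byte[] transform(String xml) throws Exception {
        // newTransformer() with no stylesheet copies the input to the output.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Requests the same thing as encoding="ISO-8859-1" on xsl:output.
        t.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // An OutputStream (not a Writer) leaves the char-to-byte step to the serializer.
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    /** True if the byte 0xC2 appears anywhere in the output. */
    static boolean hasC2(byte[] bytes) {
        for (byte b : bytes) {
            if ((b & 0xFF) == 0xC2) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = transform(SAMPLE);
        System.out.println("stray C2 byte present: " + hasC2(bytes));
    }
}
```

If a check like this comes out clean while the application output still contains C2, the extra byte is probably being introduced somewhere outside the transform call itself.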
Joined: May 24, 2001
Just because I hate it when these threads die without a solution: I found the problem, in an unexpected place. After a ton of fiddling, I realized that nothing was changing in any way I might have expected. It turned out that a jar someone had sent me to test, which was on my classpath, contained (unknown to me) an older version of the class I was trying to fix, and that older version was taking precedence over the one I was modifying. The actual problem was simply that the output writer had an encoding specified.
was replaced with
subject: Encoding troubles when applying stylesheets through Xalan
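For anyone landing here with the same symptom, below is a sketch of the kind of mismatch described in the last post. It is not the original code, which is not shown in the thread; the class, method and sample document are assumptions for illustration. The point is that when a StreamResult wraps a Writer, the Writer's charset decides the bytes that get written, whatever encoding the stylesheet or Transformer asks for.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.StringReader;
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class WriterEncodingMismatch {
    // Hypothetical sample input containing one section sign as a character reference.
    static final String SAMPLE = "<doc>&#xa7; 12</doc>";

    /** Identity transform declared as ISO-8859-1, written through a Writer with the given charset. */
    static byte[] transformViaWriter(String xml, String writerCharset) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The serializer hands characters to this Writer; the Writer picks the bytes.
        Writer writer = new OutputStreamWriter(sink, writerCharset);
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(writer));
        writer.flush();
        return sink.toByteArray();
    }

    /** Counts how often a given byte value occurs. */
    static int countByte(byte[] bytes, int value) {
        int count = 0;
        for (byte b : bytes) {
            if ((b & 0xFF) == value) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Writer charset disagrees with the declared encoding: section sign becomes C2 A7.
        byte[] mismatched = transformViaWriter(SAMPLE, "UTF-8");
        // Writer charset matches the declared encoding: section sign is the single byte A7.
        byte[] matched = transformViaWriter(SAMPLE, "ISO-8859-1");
        System.out.println("C2 bytes with UTF-8 writer: " + countByte(mismatched, 0xC2));
        System.out.println("C2 bytes with ISO-8859-1 writer: " + countByte(matched, 0xC2));
    }
}
```

Either making the Writer's charset match the declared output encoding, or giving the StreamResult an OutputStream so the serializer does the byte encoding itself, should avoid the mismatch. Which of the two the original fix used is not stated in the thread.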