
apostrophe not getting rendered correctly

Stephen Huey
Ranch Hand

Joined: Jul 15, 2003
Posts: 618
XStream is used to generate XML in our application, but I don't think this problem is specific to XStream. I'm wondering if there's something I can do in Java to make sure that this problem doesn't happen.

We are sending an XML file to another internal application, and I'm having trouble with an apostrophe in one of the XML nodes. It's a simple object and there is no special converter registered for it, and its comments field is just a String. In two UNIX testing environments, we are seeing different behavior, and I don't even know where to begin to look for the cause. At first, I assumed their parser was mishandling something, but upon investigation, we found out that this is what they're seeing in the XML file they receive from us:


XML: <comments>These are Stephen's comments.</comments>


Actually, that's not exactly what you see above: in the file it's Stephen followed by an ampersand followed by apos; and then the final s (the end of "Stephen's"). Anyway, this renders correctly as: These are Stephen's comments.


In another environment (that should be the same as far as character encodings are concerned), here's what they see coming from us:

XML: <comments>These are Stephen&#39;s comments.</comments>

"Stephen's" gets incorrectly rendered as Stephen followed by an ampersand followed by #39; followed by the final s.

I don't know how this could be happening if the code is the same. Has anyone seen something like this before? Is there a way I can put in a bit of safety code to make sure this doesn't happen in any environment?

Thanks,
Stephen

[ May 01, 2008: Message edited by: Stephen Huey ]

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Originally posted by Stephen Huey:
In another environment (that should be the same as far as character encodings are concerned), here's what they see coming from us:

XML: <comments>These are Stephen&#39;s comments.</comments>

"Stephen's" gets incorrectly rendered as Stephen followed by an ampersand followed by #39; followed by the final s.
I don't know why different versions of the XML are being produced, but what you posted there isn't incorrect and it's equivalent to the other example. (That's a numeric character reference for the apostrophe.) There shouldn't be any complaints about it, except misguided ones from people who are eyeballing the XML or people who are using non-compliant XML parsers.
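
To see the equivalence described above for yourself, here is a minimal sketch using only the JDK's built-in DOM parser. The class name ApostropheEquivalence and the helper commentsText are made up for illustration and are not part of XStream or of the application in question; the two input strings are the named and numeric forms discussed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ApostropheEquivalence {

    // Hypothetical helper: parses a one-element XML document and returns
    // the decoded text content of its root element.
    static String commentsText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Named entity form (what the first environment produced).
        String named = "<comments>These are Stephen&apos;s comments.</comments>";
        // Numeric character reference form (decimal 39 is the apostrophe).
        String numeric = "<comments>These are Stephen&#39;s comments.</comments>";

        System.out.println(commentsText(named));   // These are Stephen's comments.
        System.out.println(commentsText(numeric)); // These are Stephen's comments.
        System.out.println(commentsText(named).equals(commentsText(numeric))); // true
    }
}
```

If the receiving application gets a different string out of these two inputs, that would suggest it is not running the payload through a compliant XML parser.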
Stephen Huey
Ranch Hand

Joined: Jul 15, 2003
Posts: 618
I put in the text incorrectly for the second (bad) environment (actually, the browser rendered it differently from the way I typed it in). What we saw in the XML in the 2nd environment was:

"Stephen" followed by the ampersand symbol followed by "amp;" followed by "#39;s"

This gets incorrectly rendered as "Stephen" followed by the ampersand symbol followed by "#39;s"

So, I know that #39 is the correct numeric code, but I'm wondering if what's happening is some sort of halfway conversion. On the following page, the correct numeric code for an apostrophe has the ampersand symbol in front:

http://www.w3.org/MarkUp/html-spec/html-spec_13.html

So if we now for some reason have

[ampersand symbol] plus "amp;" plus "#39;"

then I'm guessing the parser might translate

[ampersand symbol] plus "amp;"

into the ampersand symbol, and on that conversion pass it didn't know what to do with "#39;" because it hadn't seen an ampersand symbol in front of it. What I don't know is why that would have happened in the first place!

[ May 01, 2008: Message edited by: Stephen Huey ]

[ May 02, 2008: Message edited by: Stephen Huey ]
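
One way the "halfway conversion" guessed at above could arise is if the text gets escaped twice somewhere before it is sent. The sketch below is only an illustration under that assumption: escapeXml is a hypothetical hand-written escaper, not XStream's actual writer, and nothing in this thread confirms that double escaping is the real cause. It shows that escaping an already-escaped string yields the ampersand, "amp;", "#39;" sequence, and that a compliant parser decoding it once gives back the literal text with "#39;" still visible.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

class DoubleEscapeDemo {

    // Hypothetical escaper (an assumption for illustration): replaces the
    // five XML special characters, writing the apostrophe as &#39;.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Parses <comments>...</comments> with the JDK DOM parser and returns
    // the decoded text, i.e. what the receiving side would see.
    static String decodeOnce(String escapedBody) {
        String xml = "<comments>" + escapedBody + "</comments>";
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String raw = "Stephen's";
        String once = escapeXml(raw);   // escaped a single time
        String twice = escapeXml(once); // escaped again by mistake

        System.out.println(once);               // Stephen&#39;s
        System.out.println(twice);              // Stephen&amp;#39;s
        System.out.println(decodeOnce(once));   // Stephen's
        System.out.println(decodeOnce(twice));  // Stephen&#39;s
    }
}
```

If this is what is happening, the place to look would be any step in the bad environment that escapes the comments text before the XML serializer escapes it again.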