Escaping out characters

Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
I am setting an XML element attribute equal to a string value that must be wrapped in "<![CDATA[" + whatever + "]]>". This code renders as &lt;![CDATA[ whatever ]]&gt; inside the tags. How do I escape the "<" and ">" to avoid the &lt; and &gt; results?

Thanks!
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

I don't think I understand the question, because you can't use CDATA in an attribute value. So if you have an attribute value which contains a "<" character, you have to replace that by the string "&lt;". That applies if you are writing the XML out yourself. If you're using XML-aware software to serialize the document, though, you don't have to do any escaping; the software will do that for you.
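
To illustrate the point about XML-aware software, here is a minimal sketch using the JDK's own DOM and Transformer APIs. The element name "product" and the attribute name "externalId" are hypothetical stand-ins, since the original code is not shown in the thread. The raw "<" is passed in unescaped and the serializer takes care of it.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class AttributeEscapeDemo {

    // Builds a one-element document with a single attribute and serializes it
    // with the JDK's identity Transformer. "product" and "externalId" are
    // hypothetical names chosen for this sketch.
    static String serialize(String attributeValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element product = doc.createElement("product");
        product.setAttribute("externalId", attributeValue); // raw, unescaped value
        doc.appendChild(product);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The "<" in the value is written out as an escaped entity.
        System.out.println(serialize("a < b"));
    }
}
```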

If that wasn't what you were asking, then go ahead and clarify. In particular where you said that some code "must" be wrapped in the CDATA wrapper... what's that about? Perhaps examples would help.

(Also review how escaping actually works. Because it works here in the forum too, so you should type in those characters in their escaped forms here as well.)
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
Hi, my code has several lines similar to:

that keep showing up in the final XML doc as:

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Aha, so it isn't actually an attribute value after all. That simplifies things. But what you posted there looks like an ordinary CDATA section to me. What's the problem with it?
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
The Ranch is translating my symbols for presentation, whereas when I view the document in my editor they show up as the escaped representations of the characters.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Paul Clapham wrote:(Also review how escaping actually works. Because it works here in the forum too, so you should type in those characters in their escaped forms here as well.)
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

So would it be correct for me to assume that what you see in the document is this?

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Assuming that my assumption is correct, then I would suggest changing your line of code to this:



I say that because it looks like whatever is converting your "product" object to XML is doing a competent job of it, including escaping characters which need to be escaped (like the < and > characters). Given that, there is no need to use a CDATA section in that element, a plain old text node would work just as well.

Remember that there are no situations in XML where a CDATA section must be used, it's always possible to use a text node with suitable escaping. CDATA is simply a convenience for people who are typing in XML from a keyboard, and for serializing software which doesn't want to deal with proper escaping.

So anything which says that a CDATA section must be used in a particular situation is either over-specifying things (in which case it could be fixed to not say that) or else it's saying that because the output is being fed into a non-compliant XML parser which won't work unless it finds a CDATA section there.
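
As a small check of the claim that a CDATA section and an escaped text node carry the same content, the sketch below parses both forms with the JDK's DOM parser and compares the resulting text. The sample value "a < b & c" is made up for illustration, and the element name is borrowed from later in the thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataEquivalenceDemo {

    // Parses the XML string and returns the text content of the root element.
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String withCdata = "<ExternalId><![CDATA[a < b & c]]></ExternalId>";
        String withEscapes = "<ExternalId>a &lt; b &amp; c</ExternalId>";

        // A compliant parser hands back the same characters for both forms.
        System.out.println(textOf(withCdata).equals(textOf(withEscapes)));
    }
}
```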
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
Hi Paul, thank you for the excellent reply. Unfortunately, I am being forced to include the CDATA text because the company I am developing this feed for requires it in each line so their tool can convert my feed to a spreadsheet. I did some more research and found the StringEscapeUtils utility and tried


and


and neither resolves the problem.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Dan Grindstaff wrote: Unfortunately, I am being forced to include the CDATA text because the company I am developing this feed for requires it in each line so their tool can convert my feed to a spreadsheet.


As I suspected, you're being forced to feed a bozo parser. Too bad about that, you wouldn't have the problem if they had a real parser.

Anyway here's the thing. You can't force your serializer to output a CDATA section by faking one like that. It's just going to get escaped as text by your serializer. So what you have to do is to just pass the text (without the CDATA wrapper) to your setExternalId() method. And then you have to get into whatever is actually serializing the data to XML, and convince it to output that data as CDATA.

If you can't do that, then an alternative would be to capture the output (again without the CDATA wrapper) and apply to it an XSLT identity transformation which writes that element as CDATA.
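
The difference between faking a CDATA wrapper and asking the serializer for a real one can be sketched with the JDK's DOM API. This is only an illustration under assumptions: the thread never shows which library actually serializes the "product" object, so DOM stands in for it here, and the class and method names are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class FakeCdataDemo {

    // Serializes <ExternalId> holding the value either as a genuine CDATA
    // node or as plain text with a hand-typed CDATA wrapper around it.
    static String serialize(boolean realCdata, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element externalId = doc.createElement("ExternalId");
        doc.appendChild(externalId);

        if (realCdata) {
            // The serializer is told explicitly that this node is CDATA.
            externalId.appendChild(doc.createCDATASection(value));
        } else {
            // The wrapper is just more text, so it is escaped like any text.
            externalId.appendChild(doc.createTextNode("<![CDATA[" + value + "]]>"));
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(false, "whatever")); // wrapper gets escaped
        System.out.println(serialize(true, "whatever"));  // genuine CDATA section
    }
}
```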
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
Thank you, Paul. I have a meeting today with the client to discuss this.
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
So I met with the client and they are not budging. I have to find a way to get the CDATA code into the XML document. From your last reply I guess I need to figure out where the serialization is happening.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Dan Grindstaff wrote:So I met with the client and they are not budging.


Not surprising.

I have to find a way to get the CDATA code into the XML document. From your last reply I guess I need to figure out where the serialization is happening.


Yes. Or else apply the other possibility which I suggested.
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
I think the second alternative will work better for me at this point. Could you suggest some links for learning how to implement this? Thanks.
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
So I found the following post here that seems to address modifying the marshalling process. What I don't understand is this: currently, the marshaller writes to a file location as given here, and the new code is as follows. I need some help figuring out how to get this into a form that writes a file out to a location so I can upload it. Thanks!
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

But the first piece of code already marshals the document to a file. If you want it to write to some other file, just change "PROD_FILE_LOC" to the full path to that other file.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Actually, if you want to marshal your data into an in-memory XML file, and then transform that into a "fixed" XML file which you write to the "PROD_FILE_LOC" destination, then I would change your proposed code to use a ByteArrayOutputStream rather than a StringWriter. There's no reason to convert from bytes to chars at this point.

And then I would write an XSLT transformation which is just an identity transformation, only it uses



to cause "ExternalId" elements to be output as CDATA sections. You'd set up the input to come from the ByteArrayOutputStream (or actually from its underlying byte array) and the output to go to "PROD_FILE_LOC".
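
One way the approach described above could look is sketched below, using only the JDK's javax.xml.transform classes. The stylesheet is an assumption, not the snippet from the original post: it is a standard identity transformation whose xsl:output declaration lists "ExternalId" in cdata-section-elements. The class name, method name, and sample input are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CdataIdentityTransform {

    // Identity transformation: copies every node and attribute unchanged.
    // cdata-section-elements asks the serializer to write the text of
    // ExternalId elements as CDATA sections.
    private static final String IDENTITY_XSLT =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" cdata-section-elements=\"ExternalId\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Reads the already-marshalled XML bytes and writes the "fixed" XML
    // to the given destination stream.
    static void rewrite(byte[] marshalledXml, OutputStream destination) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(IDENTITY_XSLT)));
        transformer.transform(
                new StreamSource(new ByteArrayInputStream(marshalledXml)),
                new StreamResult(destination));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the bytes collected in the ByteArrayOutputStream.
        byte[] marshalled = "<product><ExternalId>a &lt; b</ExternalId></product>"
                .getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        rewrite(marshalled, out);
        System.out.println(out.toString("UTF-8"));
    }
}
```

In the real feed, the input would be the byte array from the marshaller's ByteArrayOutputStream (its toByteArray() result), and the destination would be a FileOutputStream opened on "PROD_FILE_LOC" instead of the in-memory stream used here.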
Dan Grindstaff
Ranch Hand

Joined: Sep 24, 2006
Posts: 138
I apologize; I think I am confusing things. I have run the code and it prints out correctly to the screen. I just need to figure out a way to get this information printed to a file so I can upload it. Thanks.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
    

Dan Grindstaff wrote:I just need to figure out a way to get this information printed to a file so I can upload it. Thanks.


That's the code you already have:

 