
Escape XML special characters?

Minh Nam
Ranch Hand

Joined: Sep 10, 2011
Posts: 57
Hi friends,

How can I escape XML special characters like &, ', >, < in an XML element's body text?
I am using the org.w3c.dom.* packages, but I could not find any utility class that does what I need. Escaping special characters is a common task, so I cannot believe it is not part of the standard Java XML API.

Any ideas?


g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 513
    
Apache Commons Lang has StringEscapeUtils, with an escapeXml method:
http://commons.apache.org/lang/api-2.4/org/apache/commons/lang/StringEscapeUtils.html
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41865
    
IMO, a better approach would be to put data that can contain such characters into CDATA sections. One benefit of that would be that it is clear that the data contains "&", and not "&amp;" - otherwise that may not be obvious to someone who processes that XML.
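
To illustrate the CDATA suggestion, here is a minimal sketch using only the standard DOM and Transformer classes. The class name CdataSketch, the method wrapInCdata and the element name "note" are made up for this example. It assumes the JDK's default Transformer, which writes a CDATASection node back out as a CDATA section.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of any library.
public class CdataSketch {

    // Builds <elementName><![CDATA[data]]></elementName> and returns it as a String.
    static String wrapInCdata(String elementName, String data) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element el = doc.createElement(elementName);
            // The raw data goes in unescaped; the CDATA section carries it as-is.
            el.appendChild(doc.createCDATASection(data));
            doc.appendChild(el);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapInCdata("note", "Smith & Sons"));
    }
}
```

One thing to watch: data that itself contains the sequence "]]>" cannot sit in a single CDATA section, so this approach needs extra care for arbitrary input.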


g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 513
    
That works just as well for me too.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

Just a comment: if you're using the org.w3c.dom packages, then generally speaking you don't need to concern yourself with escaping those characters. For example, setting the text of a DOM node to a string that contains a raw "&" is perfectly legitimate and you don't need to escape that ampersand. Escaping only applies when an XML document is serialized to an external format (i.e. a text file), not when it is in an internal format like that. If you stick to using only the XML code in the standard API, for example using a Transformer to serialize your DOM, that escaping is taken care of for you by that code. (And that's why the standard API doesn't include a method to do the escaping on a string.) You only need to deal with escaping when you are writing the serialization code yourself, for example if you're building the markup by concatenating a string variable called "name" between literal tags.

If you did that, then you would have to apply the escaping rules to the string in the "name" variable.
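
The two cases described above can be sketched roughly as follows. This is only an illustration: the class DomEscapingSketch and its methods are invented for the example, and the hand-written escapeText covers only &, < and >, which is what element body text needs (attribute values would also need quote handling).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not part of the standard API.
public class DomEscapingSketch {

    // Case 1: build the element through the DOM and let a Transformer serialize it.
    // The raw string goes in unescaped; escaping happens during serialization.
    static String serializeWithTransformer(String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element el = doc.createElement("name");
            el.setTextContent(name);
            doc.appendChild(el);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    // Minimal escaping for element body text; "&" must be replaced first.
    static String escapeText(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Case 2: hand-written serialization, where the escaping is our own job.
    static String serializeByHand(String name) {
        return "<name>" + escapeText(name) + "</name>";
    }

    public static void main(String[] args) {
        System.out.println(serializeWithTransformer("Smith & Sons"));
        System.out.println(serializeByHand("Smith & Sons"));
    }
}
```

In both cases the ampersand ends up as &amp;amp; in the output text; the difference is only who does the work.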
Minh Nam
Ranch Hand

Joined: Sep 10, 2011
Posts: 57
Thanks guys for your replies.

Paul Clapham wrote:If you stick to using only the XML code in the standard API, for example using a Transformer to serialize your DOM, that escaping is taken care of for you by that code.


Actually I am using a Transformer to convert the Document object to a String representation of the XML. The output String is then passed as an argument to a JavaScript function.

So I want to escape any characters from the source XML that may break the Javascript.eval() method call.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

Minh Nam wrote:Actually I am using a Transformer to convert the Document object to a String representation of the XML. The output String is then passed as an argument to a JavaScript function.

So I want to escape any characters from the source XML that may break the Javascript.eval() method call.


The XML output by the Transformer will already be correctly escaped. Are you asking this question because you actually encountered a problem? The only problem that I can see is that you might have to do JavaScript escaping if the source XML contains apostrophes.
Minh Nam
Ranch Hand

Joined: Sep 10, 2011
Posts: 57
Paul Clapham wrote:The XML output by the Transformer will already be correctly escaped. Are you asking this question because you actually encountered a problem? The only problem that I can see is that you might have to do JavaScript escaping if the source XML contains apostrophes.


Yes, I encountered a problem with the apostrophes, so I will probably do some simple escaping myself, without using any external libraries.

Thanks a lot for sharing your experience.
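
For anyone landing here with the same apostrophe problem, a simple library-free escaping along those lines might look like the sketch below. It assumes the XML is spliced into a quoted JavaScript string literal; the class JsStringEscapeSketch and the JavaScript function name handleXml are placeholders, since the original calling code is not shown in this thread.

```java
// Hypothetical helper for embedding text in a JavaScript string literal.
public class JsStringEscapeSketch {

    // Escapes the characters that would end or corrupt a quoted JS string literal.
    static String escapeForJsString(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;   // backslash first-class case
                case '\'': sb.append("\\'"); break;    // apostrophe
                case '"':  sb.append("\\\""); break;   // double quote
                case '\n': sb.append("\\n"); break;    // line feed
                case '\r': sb.append("\\r"); break;    // carriage return
                case '\u2028': sb.append("\\u2028"); break; // JS line separator
                case '\u2029': sb.append("\\u2029"); break; // JS paragraph separator
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the script text; handleXml is an assumed function name.
    static String buildCall(String xml) {
        return "handleXml('" + escapeForJsString(xml) + "');";
    }

    public static void main(String[] args) {
        System.out.println(buildCall("<a>it's</a>"));
    }
}
```

This is applied on top of the XML escaping the Transformer already did, not instead of it: the XML string stays valid XML once JavaScript has unescaped the literal.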