
XSD Validation for character '&'

Savita Amin

Joined: Sep 03, 2003
Posts: 1
Can anyone tell me how to use a replace function in XSD to replace the character '&' with '&amp;'? The character '&' appears as data in my XML file, and because of this my XML processing fails. I am using a SAX parser for this.
Frank Carver

Joined: Jan 07, 1999
Posts: 6920
If your XML file contains the character "&" as data outside a CDATA section, it's simply not a well-formed XML file, so a parser will reject it before any XSD validation can take place. The long-term solution is to sort out whatever/whoever is generating this broken file.
In the real world, of course, we don't always have that option. I suggest you are going to have to pre-process your invalid XML file with a non-XML tool or program first. I doubt that much XML-based software will accept it at all, and if it does, it may just ignore the offending characters.
The end result of the pre-processing should be to ensure that all invalid uses of "&" are replaced by the entity reference "&amp;". The tricky bit is to recognize which are valid uses (where it already forms part of an entity reference) so they don't get inadvertently converted too.
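As a rough illustration of that pre-processing step, here is a minimal sketch in Java. The class and method names are made up for this example. It assumes that anything shaped like "&name;", "&#123;" or "&#x1F;" is an intentional reference and leaves it alone; it does not check that a named entity is actually declared, and it would also rewrite ampersands inside existing CDATA sections and comments, so treat it as a starting point only.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class AmpersandFixer {

    // Matches an "&" that is NOT the start of something shaped like an
    // entity reference (&name;) or a character reference (&#38; or &#x26;).
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Returns the text with every bare "&" replaced by "&amp;".
    static String escapeBareAmpersands(String text) {
        return BARE_AMP.matcher(text).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        String raw = "<name>Tom & Jerry &amp; Co &#38; R&D</name>";
        System.out.println(escapeBareAmpersands(raw));
    }
}
```

Run as written, this prints `<name>Tom &amp; Jerry &amp; Co &#38; R&amp;D</name>`. The cleaned text can then be handed to the SAX parser, for example through a StringReader.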
In the general case this is surprisingly hard. However, you may know more about your particular XML data feed, which can make it easier. If you only get these rogue characters inside one element, for example, you can restrict your processing to just the content of that element. You may be able to get away with just wrapping the content of that element in a CDATA section, or you may know that whatever generates that content is unaware of XML and will never generate entity references that should not be escaped.
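If the rogue characters really are confined to one element, the CDATA approach might look something like the sketch below. Everything here is an assumption for illustration: the class name, the element name passed in by the caller, and the premise that the element has no attributes, no child elements, and content that is plain text rather than already-escaped XML. If the content does contain entity references, wrapping it in CDATA would change their meaning.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class CdataWrapper {

    // Wraps the text content of every <elementName>...</elementName>
    // in a CDATA section. Only matches start tags without attributes.
    static String wrapInCdata(String xml, String elementName) {
        String tag = Pattern.quote(elementName);
        Pattern p = Pattern.compile(
                "(<" + tag + ">)(.*?)(</" + tag + ">)", Pattern.DOTALL);
        Matcher m = p.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Split any literal "]]>" so it cannot end the CDATA section early.
            String body = m.group(2).replace("]]>", "]]]]><![CDATA[>");
            String replacement =
                    m.group(1) + "<![CDATA[" + body + "]]>" + m.group(3);
            // quoteReplacement stops "$" and "\" in the data being
            // interpreted as regex replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, calling `CdataWrapper.wrapInCdata(xml, "description")` on `<item><description>Fish & Chips</description></item>` gives `<item><description><![CDATA[Fish & Chips]]></description></item>`, which a SAX parser should accept.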
Without a bit more detail about your particular problem, I can't offer anything more specific. Has this been useful, though?
[ September 04, 2003: Message edited by: Frank Carver ]
