Umm... reading that section of the XML recommendation again, it appears that character references (as opposed to characters) are immune from the normalization rules. In which case you would indeed have to replace them yourself. Something like this?
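A minimal sketch of doing that replacement by hand, assuming the references in question are whitespace ones such as `&#10;` or `&#9;`, which reach the application as literal line feed and tab characters. The class and method names (`AttrWhitespace`, `toSpaces`) are hypothetical, not part of Xerces or any XML API. The method would be called on each attribute value the parser hands back.

```java
public final class AttrWhitespace {
    private AttrWhitespace() {}

    // Hypothetical helper: turns tab, line feed and carriage return into
    // spaces, which is what attribute-value normalization does to the
    // literal characters but not to character references.
    public static String toSpaces(String value) {
        StringBuilder out = null; // allocated only when a replacement is needed
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\t' || c == '\n' || c == '\r') {
                if (out == null) {
                    out = new StringBuilder(value);
                }
                out.setCharAt(i, ' ');
            }
        }
        // Values without any of these characters are returned unchanged,
        // so the common case creates no new String.
        return out == null ? value : out.toString();
    }
}
```

Since most attribute values probably contain none of these characters, the single scan with no allocation should keep the cost low even on large files.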
Thanks Paul. That is what I found too. I was just hoping that I had missed something or that there was a setting somewhere. We are working with very large multi-MB files and I am trying to avoid adding String manipulation/comparison routines when processing attributes.
I think this is a good example of poor XML design. Attributes should not have text (sentences) as values. In these cases, the text should be element content, not an attribute value. Unfortunately, the XML design is out of our control.
I'm thinking that we might use a UNIX sed/awk program to read through the file and replace these in the XML document before sending it to the Xerces parse routine. I'm not sure how big an issue this is right now.
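For comparison, the same pre-parse substitution could be sketched in Java instead of sed/awk. This is only a rough sketch under assumptions: it targets the whitespace references (`&#9;`, `&#10;`, `&#13;` and their hex forms), the names `CharRefFilter`, `replaceWhitespaceRefs` and `filter` are made up for illustration, and it rewrites matches everywhere in the file, including element content and CDATA sections, which may or may not be acceptable.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.regex.Pattern;

public final class CharRefFilter {
    // Decimal 9, 10, 13 or hex 9, A, D, with optional leading zeros.
    // References such as &#160; or &#xA0; are deliberately not matched.
    private static final Pattern WS_REF =
            Pattern.compile("&#(?:0*(?:9|10|13)|x0*(?:9|[aAdD]));");

    private CharRefFilter() {}

    // Replaces each whitespace character reference in the text with a space.
    public static String replaceWhitespaceRefs(String line) {
        return WS_REF.matcher(line).replaceAll(" ");
    }

    // Streams the document line by line so the whole file is never in memory.
    // Line endings are rewritten as "\n".
    public static void filter(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(replaceWhitespaceRefs(line));
            writer.write('\n');
        }
        writer.flush();
    }
}
```

The output of `filter` would then be what gets handed to the parser in place of the original file.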
subject: Removing Character References from Attributes