In our application we receive XML data as input from an external application. When this XML arrives at our end, we read it through the JDOM API. We do not validate against any DTD or Schema because that is not possible in this scenario. The third-party app cannot modify or tweak the XML to meet our requirements before submitting it to us. We have to live with whatever XML it provides and adapt our code to handle it.
When we started processing that XML through the JDOM API, we got the following error: "The content of elements must consist of well-formed character data or markup."
The problem is that the XML generated by the third-party app is not well formed. Some of the tag names have spaces in them, like this: < id>abc123</id> and <parent-ref-id>111zzz</parent- ref-id>
Since we cannot make the third party change the way it generates XML, we want to impose well-formedness on the XML we receive before we start processing it with JDOM. The error scenarios are always stray spaces inside tag names, as in the id opening tag and the parent-ref-id closing tag shown above.
Please let me know your suggestions on how to write a generic routine that handles this and imposes well-formedness on the XML document.
Why don't you simply remove any whitespace immediately following a "<" and immediately preceding a ">" (if they are not inside a CDATA section, of course)?
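A rough sketch of that idea using java.util.regex, in case it helps. The class name TagSpaceCleaner and the method clean are made up for illustration. It assumes the whole document fits in a String, leaves CDATA sections and comments untouched, and goes one step beyond the suggestion above: because a closing tag holds nothing but a name, it strips all whitespace inside closing tags, which also covers the </parent- ref-id> case. Spaces inside an opening tag's name (after the first character) are not handled, since they cannot be told apart from attribute separators.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JDOM or any library.
class TagSpaceCleaner {
    // group 1: CDATA section or comment (copied through unchanged)
    // group 2: a closing tag, possibly with stray whitespace in it
    // otherwise: "<" followed by whitespace at the start of a tag
    private static final Pattern STRAY = Pattern.compile(
            "(<!\\[CDATA\\[.*?\\]\\]>|<!--.*?-->)|(<\\s*/[^<>]*>)|<\\s+",
            Pattern.DOTALL);

    static String clean(String xml) {
        Matcher m = STRAY.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            if (m.group(1) != null) {
                replacement = m.group(1);                        // keep as is
            } else if (m.group(2) != null) {
                replacement = m.group(2).replaceAll("\\s+", ""); // "</a- b>" becomes "</a-b>"
            } else {
                replacement = "<";                               // "< id" becomes "<id"
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The cleaned string can then be wrapped in a java.io.StringReader and handed to the parser.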
The only really acceptable solution is to generate correct XML, though. You should work with the source of the files on that.
You could create a custom subclass of FilterInputStream to process the XML before the parser gets it. This custom class could just discard spaces inside the tags. That is the general way the java.io package handles modifying byte input streams.
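Here is a minimal sketch of such a filter, under some assumptions of mine: the class name TagSpaceFilterInputStream is invented, the input is in an ASCII-compatible encoding (UTF-8 or ISO-8859-1, not UTF-16), and no CDATA section or comment contains a "<" followed by a space. Rather than discarding every space inside a tag, which would break attributes, it drops whitespace right after "<" and all whitespace inside closing tags.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical filter: strips stray whitespace from tag names on the fly.
class TagSpaceFilterInputStream extends FilterInputStream {
    private static final int TEXT = 0;      // outside a tag, or past the start of an opening tag
    private static final int AFTER_LT = 1;  // just saw '<'
    private static final int CLOSING = 2;   // inside "</...>"
    private int state = TEXT;

    TagSpaceFilterInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        while (true) {
            int b = super.read();
            if (b == -1) return -1;
            boolean ws = b == ' ' || b == '\t' || b == '\r' || b == '\n';
            switch (state) {
                case AFTER_LT:
                    if (ws) continue;                 // drop spaces after '<'
                    state = (b == '/') ? CLOSING : TEXT;
                    return b;
                case CLOSING:
                    if (ws) continue;                 // closing tags hold only a name
                    if (b == '>') state = TEXT;
                    return b;
                default:
                    if (b == '<') state = AFTER_LT;
                    return b;
            }
        }
    }

    // FilterInputStream forwards bulk reads straight to the wrapped stream,
    // so route them through read() to make sure the filtering is applied.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int n = 0;
        while (n < len) {
            int b = read();
            if (b == -1) break;
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // the tag state cannot be rewound
    }
}
```

With JDOM you would then wrap the incoming stream before parsing, along the lines of new SAXBuilder().build(new TagSpaceFilterInputStream(rawStream)), where rawStream stands for whatever InputStream delivers the third-party XML.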