System generated xml not well formed

Udayan Kumar
Ranch Hand

Joined: Jan 16, 2007
Posts: 66
Hi All,

In our application we get XML data as input from an external application.
When this XML arrives at our end, we run it through the JDOM API to read it.
We do not validate against a DTD or schema because that is not possible in this scenario. The third-party app cannot modify or tweak the XML to meet our requirements before submitting it to us.
We have to live with whatever XML the third-party app provides and adjust our code to handle it.

So now, when we process that XML through the JDOM API, we get the following error:
The content of elements must consist of well-formed character data or markup.

The problem is that the XML generated by the third-party app is not well formed. Some of the XML tag names have spaces in them.
Something like this < id>abc123</id> and
<parent-ref-id>111zzz</parent- ref-id>

Since we cannot make the third party change the way it generates XML, we want to impose well-formedness on the XML we receive before we start processing it with JDOM.
The errors are always stray spaces inside tag names, as you can see in the id opening tag and the parent-ref-id closing tag above.

Please let me know your suggestions on how we can write a generic routine to handle this and impose well-formedness on the XML document.

Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42965
Why don't you simply remove any whitespace immediately following a "<" and immediately preceding a ">" (if they are not inside a CDATA section, of course)?

The only really acceptable solution is to generate correct XML, though. You should work with the source of the files on that.
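
A minimal sketch of the whitespace-trimming idea above, assuming the whole document fits in memory as a String. The class name TagWhitespaceTrimmer and its trim method are made up for illustration. It copies CDATA sections through untouched and removes whitespace only right after "<" or "</" and right before ">". Note that it would not repair a space in the middle of a name, as in the </parent- ref-id> example, and it also removes whitespace before a literal ">" in ordinary text content.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JDOM or the JDK.
final class TagWhitespaceTrimmer {

    // Alternative 1 (group 1): a whole CDATA section, copied through unchanged.
    // Alternative 2: "<" or "</" followed by whitespace (group 2 is the optional slash).
    // Alternative 3: whitespace immediately before ">".
    private static final Pattern EDGE_SPACE = Pattern.compile(
            "(<!\\[CDATA\\[.*?\\]\\]>)|<(/?)\\s+|\\s+>", Pattern.DOTALL);

    static String trim(String xml) {
        Matcher m = EDGE_SPACE.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            if (m.group(1) != null) {
                replacement = m.group(1);        // CDATA: leave as-is
            } else if (m.group().endsWith(">")) {
                replacement = ">";               // drop whitespace before ">"
            } else {
                replacement = "<" + m.group(2);  // drop whitespace after "<" or "</"
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The cleaned string could then be wrapped in a java.io.StringReader and handed to the parser in place of the raw input.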
William Brogden
Author and all-around good cowpoke

Joined: Mar 22, 2000
Posts: 13035
You could create a custom subclass of FilterInputStream to process the XML before the parser gets it. This custom class could just discard spaces inside the tags. That is the general way the java.io package handles modifying byte input streams.
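
A rough sketch of such a filter, under several assumptions that go beyond the thread: the input is in an ASCII-compatible encoding such as UTF-8, the element tags carry no attributes (discarding every space inside a tag would otherwise glue the name and attributes together), and comments or CDATA sections do not contain ">". The class name TagSpaceStrippingInputStream is made up for illustration. Markup that starts with "<?" or "<!" is passed through unchanged so that the XML declaration survives.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical filter, not part of JDOM or the JDK.
final class TagSpaceStrippingInputStream extends FilterInputStream {

    private static final int TEXT = 0;         // outside any markup
    private static final int TAG_START = 1;    // just saw '<'
    private static final int IN_TAG = 2;       // inside an element tag: drop spaces
    private static final int PASS_THROUGH = 3; // inside <? ... > or <! ... >: keep spaces

    private int state = TEXT;

    TagSpaceStrippingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            if (state == TEXT) {
                if (b == '<') {
                    state = TAG_START;
                }
                return b;
            }
            if (b == '>') {
                state = TEXT;
                return b;
            }
            if (state == PASS_THROUGH) {
                return b;
            }
            if (b == ' ' || b == '\t' || b == '\n' || b == '\r') {
                continue; // discard whitespace inside an element tag
            }
            if (state == TAG_START) {
                state = (b == '?' || b == '!') ? PASS_THROUGH : IN_TAG;
            }
            return b;
        }
        return -1;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int b = read(); // route bulk reads through the filtering logic
            if (b == -1) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // the tag state cannot be rewound
    }
}
```

With JDOM, the wrapped stream could be passed to SAXBuilder.build(InputStream) in place of the raw one. Because the filter reads one byte at a time, wrapping the source in a BufferedInputStream first is probably worthwhile.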
