parsing xml with lots of top element

Hendra Kurniawan
Ranch Hand

Joined: Jan 31, 2011
Posts: 239
I have an XML like this:


It looks like several XML documents concatenated into one file. What is the best way to parse this? The document producer is still under construction, hence the concatenation. This is guaranteed to be the only remaining problem (every other aspect of the XML is valid). So in the meantime I need a practical way to keep parsing the XML. Thanks.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

You call it a "document", but as I'm sure you already found out, it isn't an XML document because a well-formed XML document has a single root element. And what you posted doesn't have a single root element. And as you already found out, XML parsers can't deal with it because it isn't a well-formed XML document.
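
To see the single-root rule in action, here is a minimal sketch using the standard JAXP DOM parser. The class name `WellFormedCheck` and the sample `<person/>` strings are made up for illustration, since the XML from the original post is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
class WellFormedCheck {

    // Returns true if the parser accepts the text as a well-formed XML document.
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and keeps them off stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness violations surface as SAXException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<person/>"));
        System.out.println(isWellFormed("<person/><person/>"));
    }
}
```

The second call has two top-level elements, so the parser rejects it with a fatal error, which is the same failure you get on the concatenated file.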

You could conceivably deal with it by wrapping the whole thing into a root element, maybe by just putting <root> at the beginning and </root> at the end. But then it still has those <?xml ... ?> things in the middle, which look like processing instructions but have a name ("xml") which processing instructions aren't allowed to have, so the parsers will complain about that too.
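
If you did want to try the wrapping idea anyway, a rough sketch could look like the following. It is only a stopgap under several assumptions: the whole input fits in memory as a string, it contains no DOCTYPE declarations, and the text `<?xml ... ?>` never appears inside a comment or CDATA section. The class name `WrapInRootSketch`, the wrapper element name `root`, and the sample `person` elements are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class WrapInRootSketch {

    // Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>.
    // The \s after "xml" keeps real processing instructions like
    // <?xml-stylesheet ...?> untouched.
    private static final Pattern XML_DECL = Pattern.compile("<\\?xml\\s[^?]*\\?>");

    // Strips every XML declaration, wraps the rest in a synthetic root, and parses it.
    static Document parseConcatenated(String concatenated) throws Exception {
        String body = XML_DECL.matcher(concatenated).replaceAll("");
        String wrapped = "<root>" + body + "</root>";
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(wrapped)));
    }

    // Returns the original top-level elements, i.e. the children of the synthetic root.
    static List<Element> topLevelElements(Document doc) {
        List<Element> result = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) { // skips whitespace text between fragments
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String input = "<?xml version=\"1.0\"?><person><name>A</name></person>\n"
                     + "<?xml version=\"1.0\"?><person><name>B</name></person>";
        for (Element e : topLevelElements(parseConcatenated(input))) {
            System.out.println(e.getTagName() + ": " + e.getTextContent());
        }
    }
}
```

Each element returned by `topLevelElements` corresponds to one of the original concatenated documents, so it can be handed to whatever code already processes a single document.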

But you said

Right now the document producer is still under construction


So my suggestion would be to wait until the document producer is working right. Trying to write code which parses ill-formed XML is a waste of time if you know that eventually you won't need it.

Hendra Kurniawan
Ranch Hand

Joined: Jan 31, 2011
Posts: 239
Let's say the document producer won't be fixed until next month, while the XML is used in day-to-day operation (not crucial, it's a convenience for the users, and you know how users are when they're not "convenient"). Is it possible to parse the document into multiple person objects? I'd rather not use the "additional root" technique. And if you ask why I don't fix it myself: the other team is responsible for maintaining it and they're assigned to another job right now, so the big boss told me to somehow find a workaround. I'm thinking of something like this:

 