Problems parsing XML with SAX

Posts: 8
Hello all,

I'm in a bit of a hurry, so there may be quite an easy solution to my 'problem'; I really don't have time to RTFM at the moment.

I'm receiving n+1 (the amount varies) XML messages via a socket and have to parse them and create some boring statistics from them.

Each XML message is in the following format.

OK, no problems parsing that, but the socket returns n+1 of these messages at the same time, e.g.:

And this results in an exception:

org.xml.sax.SAXParseException: Illegal character at end of document, <.

So the question is: how can I parse multiple XML messages arriving in the same stream, or what else should I do?

Thanks for the help.

- John
Posts: 11962
Hmm. If I understood you correctly, there's nothing invalid about the XML document itself, just that the parser chokes on reading the stream at some random point?

If that's the case, I would probably first try switching the parser implementation (Xerces, JDK 1.4 default implementation, Saxon, etc.) and see if that helps. If not, I'd probably look into how the parser works and see where it chokes.
Posts: 13078
Is the parser recognizing that stream as N separate documents? For SAX, you should be seeing a startDocument and an endDocument call for each <foo> ... </foo> pair, but I'm not sure if the parser can even handle the concept of multiple documents in one stream.
If it is not, then maybe you need to provide a root element. If you want to go that route, there is a link on my site to an article on combining multiple XML documents (look for the "XML Article Published" header), with source code.
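
For illustration, here is a minimal sketch of the "provide a root" idea. It is not the code from the article mentioned above. It assumes the messages carry no XML declaration or DOCTYPE of their own, that each message is a <foo> element (the placeholder name used in this thread), and the names WrappedStreamParser, wrapInRoot and countMessages are made up for the example:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class WrappedStreamParser {

    /** Counts <foo> start tags; a real handler would collect the statistics instead. */
    static class FooCounter extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("foo".equals(qName)) {
                count++;
            }
        }
    }

    /** Surrounds the raw message stream with a synthetic <root> ... </root> pair. */
    static InputStream wrapInRoot(InputStream messages) {
        InputStream open = new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8));
        InputStream close = new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8));
        return new SequenceInputStream(
                Collections.enumeration(Arrays.asList(open, messages, close)));
    }

    /** Parses the wrapped stream as one document and returns the number of messages seen. */
    static int countMessages(InputStream messages) throws Exception {
        FooCounter handler = new FooCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(wrapInRoot(messages), handler);
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        // Two messages back to back, standing in for what the socket delivers.
        byte[] data = "<foo>1</foo><foo>2</foo>".getBytes(StandardCharsets.UTF_8);
        System.out.println(countMessages(new ByteArrayInputStream(data)));
    }
}
```

Note that the parser only sees the closing </root> once the underlying stream reports end of input, so with a live socket this works only if the sender closes the connection (or the data is buffered first).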
If on the other hand, the parser is just choking on a particular character in the wrong place, you can get more out of that SAXException. Here is what I use.
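
As a rough sketch of that kind of reporting (not the poster's original snippet), the location details can be read from SAXParseException, the subclass of SAXException that parsers throw for parse errors. The class name SaxErrorReport and the helper methods are assumptions made for this example:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxErrorReport {

    /** Builds a one-line description from the location data the parser recorded. */
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber()
                + ", column " + e.getColumnNumber()
                + ", systemId " + e.getSystemId()
                + ": " + e.getMessage();
    }

    /** Returns null if the text parses cleanly, otherwise a description of the error. */
    static String tryParse(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return describe(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Two root elements in one document: the same situation as in the question.
        System.out.println(tryParse("<foo/><foo/>"));
    }
}
```

The exact message text depends on the parser implementation, so it is best treated as a hint for a human reader and not something to match on in code.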
