Removing prolog from an XML file

 
Greenhorn
Posts: 12
I have a program that uses JDOM to read through and extract information from an XML file. It works fine with normal XML files; however, some files that I receive from users have a prolog before the root element.
When I run the program on these files I get the following error:

org.xml.sax.SAXParseException: Content is not allowed in prolog.

I know it's because of the prolog, but I can't ask the users to remove it. Can anybody suggest a workaround? Or is there some way in which this offending prolog can be removed within my module?

Thanks in advance!!
 
Author and all-around good cowpoke
Posts: 13078
That sounds like a job for an input stream filter: a custom class that reads the input file up to the desired legal starting point and then acts like a normal input stream to feed the parser. I don't use JDOM so I can't be more specific.

Bill
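
A minimal sketch of that idea, for illustration only. It assumes the "legal starting point" is the first `<` byte, that the junk in front never contains a `<` itself, and that the file uses an ASCII-compatible encoding such as UTF-8. The class and method names are hypothetical, and it leans on the JDK's `PushbackInputStream` instead of a hand-written filter class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

/** Hypothetical helper: skips any junk that precedes the first '<' byte. */
class PrologSkipper {

    /**
     * Returns a stream positioned at the first '<' byte of {@code in}.
     * Assumption: the leading junk never contains a '<' byte itself.
     * If no '<' is found, the returned stream is simply at end of input.
     */
    static InputStream skipToFirstTag(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 1);
        int b;
        while ((b = pushback.read()) != -1) {
            if (b == '<') {
                pushback.unread(b); // put the '<' back so the parser still sees it
                break;
            }
        }
        return pushback;
    }
}
```

The returned stream can then be handed to the parser in place of the raw file stream, for example via JDOM's `SAXBuilder.build(InputStream)`.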
 
Prashant Mishra
Greenhorn
Posts: 12
Is there any documentation/links that you can point me to?

Thanks!!
 
Prashant Mishra
Greenhorn
Posts: 12
I found some resources. I will read through them.

Thanks!!
 
Prashant Mishra
Greenhorn
Posts: 12
Hi William,

I tried out the idea you suggested, but I still get the same result.

Any other ideas would be welcome.

Thanks!!
 
Ranch Hand
Posts: 820
Try:

1. Read the file into a StringBuffer using:

2. Write it back out to a new file:

[ August 08, 2007: Message edited by: Tim McGuire ]
 
Marshal
Posts: 28177

Originally posted by Prashant Mishra:
I know it's because of the prolog

This message quite often means there is content before the prolog. Commonly this content is whitespace which you don't notice.

It might help if you looked at the document again. Does the prolog start at the beginning of the first line? If it doesn't, then you have a malformed document. And you do have the right to ask people not to send you malformed documents.
 
Prashant Mishra
Greenhorn
Posts: 12
The problem is that these XML files come from an integration scenario, where at times the middleware adds some header information before the root element of the file. These headers might contain some weird characters, and might not be the same for all files. I have to work on these XML files, hence the need to somehow cut off this unneeded information.
 
Paul Clapham
Marshal
Posts: 28177
Then you need to get that middleware fixed. There's no excuse for sending malformed XML, especially if you are a program that's supposed to provide a service of transmitting XML documents.

But if you can't (yes, I know, we live in the real world) then remove everything before the first "<" character.
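
As a rough illustration of that last resort, here is a small sketch that works on text already read into memory. It assumes the middleware header never contains a `<` of its own. The class and method names are made up for the example, and it uses the JDK's built-in DOM parser rather than JDOM so that it has no external dependencies.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: trims leading junk from XML text before parsing it. */
class FirstTagTrimmer {

    /** Drops everything before the first '<'; returns the input unchanged if there is none. */
    static String trimBeforeFirstTag(String raw) {
        int start = raw.indexOf('<');
        return start < 0 ? raw : raw.substring(start);
    }

    /** Parses the trimmed text with the JDK's DOM parser (assumption: the rest is well-formed). */
    static Document parse(String raw) throws Exception {
        InputSource source = new InputSource(new StringReader(trimBeforeFirstTag(raw)));
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
    }
}
```

If the header can itself contain a `<`, this simple rule is not enough and the starting point would have to be found some other way.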
 