I have a program that uses JDOM to read through and extract information from an XML file. It works fine with normal XML files; however, some files which I receive from users have a prolog before the root element. When I run the program on these files I get the following error:
org.xml.sax.SAXParseException: Content is not allowed in prolog.
I know it's because of the prolog, but I can't ask the users to remove it. Can anybody suggest a workaround? Or is there some way in which this offending prolog can be removed within my module?
That sounds like a job for an input stream filter - a custom class that reads the input file up to the desired legal starting point and then acts like a normal input stream to feed the parser. I don't use JDOM so I can't be more specific.
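A minimal sketch of that filter idea, using only the JDK. The class and method names (LeadingJunkSkipper, skipToFirstTag) are made up for illustration and are not part of JDOM. It assumes the unwanted leading content never contains a '<' character and that the file uses UTF-8 or a single-byte encoding (it would not work for UTF-16).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of JDOM or the JDK.
final class LeadingJunkSkipper {

    /**
     * Discards bytes until the first '<' and returns a stream positioned at that '<'.
     * Assumption: the junk before the document never contains '<' itself.
     */
    static InputStream skipToFirstTag(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 1);
        int b;
        while ((b = pushback.read()) != -1) {
            if (b == '<') {
                pushback.unread(b); // put the '<' back so the parser sees it
                break;
            }
        }
        // If no '<' was found the stream is simply at end-of-input.
        return pushback;
    }
}
```

With JDOM, the returned stream could then be handed to the parser in place of the file, for example new SAXBuilder().build(LeadingJunkSkipper.skipToFirstTag(new FileInputStream(file))).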
Originally posted by Prashant Mishra: I know it's because of the prolog
This message quite often means there is content before the prolog. Commonly this content is whitespace which you don't notice.
It might help if you looked at the document again. Does the prolog start at the beginning of the first line? If it doesn't, then you have a malformed document. And you do have the right to ask people not to send you malformed documents.
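One way to check this without relying on a text editor (which may hide whitespace or a byte order mark) is to dump whatever bytes come before the first '<'. The following is only a sketch; PrologInspector and describeLeadingBytes are hypothetical names, and the limit parameter just caps how many bytes are inspected.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic helper, not part of JDOM or the JDK.
final class PrologInspector {

    /**
     * Returns the bytes that precede the first '<' as space-separated hex,
     * reading at most 'limit' bytes. An empty string means the document
     * starts with '<' right away.
     */
    static String describeLeadingBytes(InputStream in, int limit) throws IOException {
        StringBuilder hex = new StringBuilder();
        int count = 0;
        int b;
        while (count < limit && (b = in.read()) != -1 && b != '<') {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b));
            count++;
        }
        return hex.toString();
    }
}
```

A result such as "EF BB BF" would point to a UTF-8 byte order mark, while "20" or "0A" would be a plain space or newline ahead of the document.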
The problem is that these XML files come from an integration scenario, where at times the middleware adds some header information before the root element of the file. These headers might have some weird characters, and might not be the same for all files. I have to work on these XML files, hence the need to somehow cut off this "not required" information.
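If the header can vary and might even contain a stray '<', one possible approach is to search for a marker that is known to begin the real document, such as "<?xml" or the opening tag of the root element, and drop everything before it. This is a sketch under several assumptions: HeaderStripper and stripBefore are hypothetical names, the marker is plain ASCII and does not occur inside the header, and each file is small enough to hold in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JDOM or the JDK.
final class HeaderStripper {

    /** Returns a stream that starts at the first occurrence of 'marker' in the input. */
    static InputStream stripBefore(InputStream in, String marker) throws IOException {
        byte[] data = in.readAllBytes(); // requires Java 9 or later
        byte[] needle = marker.getBytes(StandardCharsets.US_ASCII);
        int start = indexOf(data, needle);
        if (start < 0) {
            throw new IOException("Marker not found: " + marker);
        }
        return new ByteArrayInputStream(data, start, data.length - start);
    }

    /** Naive byte search; returns the index of the first match, or -1 if there is none. */
    static int indexOf(byte[] data, byte[] needle) {
        outer:
        for (int i = 0; i <= data.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }
}
```

Working on raw bytes rather than decoded text avoids having to guess the encoding of the middleware header; the parser still sees the document bytes unchanged from the marker onward.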