I want to parse an XML file that I am retrieving from a URL. The problem is that the file does not start with the XML declaration <?xml version="1.0" encoding="utf-8"?> as it is supposed to. When I try to parse the file, I get these errors:
03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatParser.finish(ExpatParser.java:553)
03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatParser.parseDocument(ExpatParser.java:483)
03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:320)
03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:277)
How can I solve this problem? Any help would be appreciated. I am using SAXParser.
Thanks & Regards,
P.S.: I also found this error, which I failed to include when I first asked the question:
03-26 10:17:03.018: WARN/System.err(274): org.apache.harmony.xml.ExpatParser$ParseException: At line 2, column 0: no element found
It's perfectly legitimate to have an XML document without a prolog. And given what you have posted, I don't see any evidence at all to point to that being your problem. I would suggest parsing the document with something which produces better error messages, so you can determine the actual problem.
If somebody is sending you an XML document with spaces at the beginning, then they are sending you a document which isn't well-formed. In other words, it isn't XML. Tell them to send you well-formed documents in the future if they expect you to process them.
When first created, your custom class would read the input up to the first < character, then let subsequent characters be read by the parser.
I just realized that for this simple problem it would be simpler to use the existing classes PushbackInputStream or PushbackReader to read up to the first <, push that character back, and then let the parser handle the rest.
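A minimal sketch of that idea follows. The class and method names (LeadingJunkSkipper, skipToFirstTag) are made up for illustration, and it assumes the document is in an ASCII-compatible encoding such as UTF-8, so that the byte for < can be recognized directly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

public class LeadingJunkSkipper {

    /**
     * Hypothetical helper: discards everything before the first '<' and
     * returns a stream whose next byte is that '<'.
     */
    public static InputStream skipToFirstTag(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 1);
        int b;
        while ((b = in.read()) != -1) {
            if (b == '<') {
                in.unread(b); // put the '<' back so the parser still sees it
                break;
            }
        }
        return in; // empty if the input contained no '<' at all
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input with blanks and a newline before the root element.
        byte[] sample = "  \n<root/>".getBytes(StandardCharsets.UTF_8);
        InputStream cleaned = skipToFirstTag(new ByteArrayInputStream(sample));
        int b;
        while ((b = cleaned.read()) != -1) {
            System.out.print((char) b);
        }
        System.out.println();
    }
}
```

The important detail is that the parser must be given the stream returned by the helper, not the original stream from the URL, and that the < is pushed back rather than consumed.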
However, I am still getting the same error. Am I doing this the way you suggested, or am I doing it incorrectly? I am posting the error for your reference; please take a look.
03-28 18:39:56.149: WARN/System.err(5439): org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: syntax error
Thanks & Regards,
No! That is the original stream. You want the modified stream, which has read past the junk.
You read through the stream until you have read a "<" character, and then you pass the rest of the document to the parser. To me it's pretty clear why that's wrong, so perhaps you just haven't taken the time to think about it.
 If I take the XML listed in the 3:29:44 post at face value, I would be surprised that the weather forecast site's service would serve the document without a doctype defining the entity &_deg; (no underscore), and with blanks before the root element aws:weather. But suppose it really happens. In that case, the way to salvage it is to supply your own entity definition.
 The SAXParserFactory should also have NamespaceAware set to true so that the content handler is given the correct local name, in case the handler makes specific use of it.
 I would suggest something of this kind so that you can test it out properly. (It seems the site cannot post the entity literally, so I put an underscore after the & which should not be there; watch out.)
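 One way this could look is sketched below; it is an illustration under assumptions, not necessarily the exact code meant above. It prepends a DOCTYPE whose internal subset defines the deg entity (written here without the placeholder underscore) and parses with a namespace-aware SAXParser. The class name, method names, sample namespace URI and the aws:temp element are hypothetical, and the sketch assumes the served document has no XML declaration and no DOCTYPE of its own.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class EntityPrologDemo {

    // DOCTYPE with an internal subset that defines the missing entity.
    static final String PROLOG =
            "<!DOCTYPE aws:weather [ <!ENTITY deg \"&#176;\"> ]>\n";

    /** Puts the DOCTYPE in front of the document bytes. */
    public static InputStream withDegEntity(InputStream doc) {
        InputStream prolog = new ByteArrayInputStream(
                PROLOG.getBytes(StandardCharsets.US_ASCII));
        return new SequenceInputStream(prolog, doc);
    }

    /** Parses the document and records local names and text content. */
    public static String parseText(InputStream doc) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is filled in
        SAXParser parser = factory.newSAXParser();
        final StringBuilder out = new StringBuilder();
        parser.parse(withDegEntity(doc), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                out.append('[').append(localName).append(']');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: blanks before the root element, an undeclared entity.
        String xml = "  <aws:weather xmlns:aws=\"http://example.com/aws\">"
                + "<aws:temp>72&deg;F</aws:temp></aws:weather>";
        System.out.println(parseText(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```

 Blanks between the DOCTYPE and the root element are allowed, so the leading spaces in the served document do not need to be stripped with this approach.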
Hey guys, great news: the thing started working and is working well now. I don't get why the errors began in the first place, but now everything seems to work just fine. Thanks for the support, though. I learned a great deal about parsing, XML, and especially PushbackInputStream. It is a great relief, though it kind of gave me sleepless nights. But all's well that ends well.