
Parsing an XML file with no XML declaration

 
Ranch Hand
Posts: 154
Android Eclipse IDE Java
Hello guys,

I want to parse an XML file that I am retrieving from a URL. The problem is that the file does not start with the XML declaration it is supposed to have, <?xml version="1.0" encoding="utf-8"?>. While trying to parse the file I get these errors:


03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatParser.finish(ExpatParser.java:553)

03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatParser.parseDocument(ExpatParser.java:483)

03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:320)

03-26 01:07:31.181: WARN/System.err(274): at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:277)


How can I solve this problem? Any help would be appreciated. I am using SAXParser.

Thanks & Regards,
Zoheb

P.S.: I also found this error, which I failed to include when I asked the question:
03-26 10:17:03.018: WARN/System.err(274): org.apache.harmony.xml.ExpatParser$ParseException: At line 2, column 0: no element found
 
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
It's perfectly legitimate to have an XML document without a prolog. And given what you have posted, I don't see any evidence at all to point to that being your problem. I would suggest parsing the document with something which produces better error messages, so you can determine the actual problem.
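As a rough illustration of both points, the sketch below parses a document that has no XML declaration and, when parsing fails, reports the line, column, and message carried by the SAXParseException. It assumes a desktop JDK, whose bundled parser is not the Android Expat parser from the stack trace above, so the wording of the messages will differ. The class name, method name, and sample documents are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ParseDiagnostics {

    /** Returns null when the document parses, otherwise "line:column: message". */
    static String describeProblem(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // DefaultHandler ignores all content but rethrows fatal errors.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // No XML declaration, but well-formed: parses cleanly, so this prints null.
        System.out.println(describeProblem("<weather><temp>21</temp></weather>"));
        // Mismatched tags: prints the position and the parser's own message.
        System.out.println(describeProblem("<weather><temp>21</weather>"));
    }
}
```

Running the same input through a second parser like this is mainly useful for comparing diagnostics; it does not prove that the Android parser behaves identically.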
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
Hey Paul, thanks for the reply, but can you suggest how I can determine the problem? I found this error too, but failed to include it when I asked the question:

03-26 10:17:03.018: WARN/System.err(274): org.apache.harmony.xml.ExpatParser$ParseException: At line 2, column 0: no element found

This might help in determining the problem.
 
Author and all-around good cowpoke
Posts: 13078
6
I suggest that your input has leading blank lines or spaces before the root element tag.

Bill
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
Hey William, thanks for the reply, but is there a way to handle those blank spaces and still parse the file successfully?
 
Paul Clapham
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
If somebody is sending you an XML document with spaces at the beginning, then they are sending you a document which isn't well-formed. In other words, it isn't XML. Tell them to send you well-formed documents in the future if they expect you to process them.
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
I have reported this to them, and the response was that they would look into it. I am still curious, however, whether there is any way to parse such a document.
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
6
Sure!

The way the Java IO classes are constructed, it is easy to create your own extension of (for example) java.io.FilterReader and let it process the input stream of characters, or an extension of java.io.FilterInputStream if you are reading bytes.

When first created, your custom class would read the input up to the first < character, then let subsequent characters be read by the parser.

Bill
I just realized that for this simple problem it would be simpler to use the existing classes PushbackInputStream or PushbackReader to read up to the first <, then let the parser handle the rest.
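A minimal sketch of the PushbackInputStream idea, assuming the input is in an ASCII-compatible encoding such as UTF-8 (a byte-level search for < would not work on UTF-16). The class name, method names, and sample document are invented for illustration and are not the code from this thread. The important detail is that the < is pushed back with unread(), and that the parser is given the wrapping stream rather than the original one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class LeadingJunkSkipper {

    /** Returns a stream positioned at the first '<', or at end of input if there is none. */
    static InputStream skipToFirstTag(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 1);
        int b;
        while ((b = in.read()) != -1) {
            if (b == '<') {
                in.unread(b); // put the '<' back so the parser still sees it
                break;
            }
        }
        return in; // hand THIS stream to the parser, not the original one
    }

    /** Parses the cleaned stream and returns the root element's qualified name. */
    static String rootElementName(InputStream raw) throws Exception {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(skipToFirstTag(raw), handler);
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        byte[] doc = "\n\n  <forecast><day>Mon</day></forecast>".getBytes(StandardCharsets.UTF_8);
        System.out.println(rootElementName(new ByteArrayInputStream(doc))); // prints "forecast"
    }
}
```

A plain read loop without the pushback would swallow the first <, leaving the parser to start in the middle of a tag.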

 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
Hello guys, based on your suggestion I tried this:



However, I am still getting the same error. Am I doing this the way you suggested, or am I doing it incorrectly? I am posting the error for your reference; please take a look.

03-28 18:39:56.149: WARN/System.err(5439): org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: syntax error

Thanks & Regards,
Zoheb
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
6


No! That is the original stream; you want the modified stream, which has read past the junk.



Bill
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
Hey Bill, I made the changes you suggested:


But it still ends up returning the same error I reported in the previous post.
I am thankful for your interest in this problem, however.
 
Paul Clapham
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
You read through the stream until you have read a "<" character, and then you pass the rest of the document to the parser. To me it's pretty clear why that's wrong, so perhaps you just haven't taken the time to think about it.

Consider this question: why are you using a PushbackInputStream?
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
I realized where I was goofing up and made the following changes to the code. It should work, but I ran into problems again, this time a different one.


The error I get is this:

03-29 00:03:41.898: WARN/System.err(307): org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: unbound prefix

The XML I wish to parse is this:



I fail to understand what the issue is
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
6
So we got past the syntax error, hurrah. Now for unbound prefix.

A Google search for "unbound prefix xml parser" found this forum thread, which I suspect will lead you to a solution.

Bill
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
I got what unbound prefix means: the prefix is not bound to a namespace. But the document does contain the required namespace:


So the file is well-formed. Why, then, does the parser throw an error?
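For reference, a small sketch of what a namespace-aware parser treats as an unbound prefix. It uses the JDK's bundled parser rather than Android's Expat, and the namespace URI and element names are placeholders, not taken from the actual feed. The second call shows one way (only a guess at what happened here) to hit the error even though the full document declares the namespace: if the parser is handed input that begins after the element carrying the xmlns:aws declaration, the prefix is no longer bound.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class UnboundPrefixDemo {

    /** Returns true if the text parses with namespace processing switched on. */
    static boolean parsesWithNamespaces(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            System.err.println(e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical namespace URI; the prefix is declared on the root element.
        String whole = "<aws:weather xmlns:aws=\"http://example.com/aws\">"
                + "<aws:temp>21</aws:temp></aws:weather>";
        // The same content starting at the child element: the declaration is gone.
        String fragment = "<aws:temp>21</aws:temp>";

        System.out.println(parsesWithNamespaces(whole));    // prints true
        System.out.println(parsesWithNamespaces(fragment)); // prints false
    }
}
```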
 
Paul Clapham
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
Frankly I would try processing that document through another parser. The parser you're using gives really useless error messages. Who knows whether it is even working correctly?
 
Ranch Hand
Posts: 734
7
[0] Taking the XML listed in the 3:29:44 post at face value, I would be surprised if the weather forecast site's service served the document without a doctype defining the entity &_deg; (no underscore) and with blanks before the root element aws:weather. But suppose it really does happen. In that case, the way to salvage it is to supply your own entity definition.

[1] The SAXParserFactory should also have NamespaceAware set to true so that the content handler is passed the correct local name, in case the handler makes specific use of it.

[2] I would suggest something of this kind so that you can test it out properly. (It seems the site cannot post the entity literally, so I put an underscore after & which should not be there, so watch out.)
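The following is a reconstruction sketch of points [0] and [1], not the code originally posted. It assumes the served document has neither an XML declaration nor a DOCTYPE of its own, and that it is UTF-8. A small DOCTYPE with an internal subset defining deg is prepended to the stream, and the factory is made namespace-aware. The namespace URI, the aws:temp element, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class EntityAndNamespaceDemo {

    // Internal subset that defines the missing entity as the degree sign (U+00B0).
    static final String DOCTYPE = "<!DOCTYPE aws:weather [<!ENTITY deg \"&#176;\">]>";

    /** Returns {namespace URI of root, local name of root, all character data}. */
    static String[] parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so uri and localName are filled in

        final String[] result = new String[3];
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (result[1] == null) {
                    result[0] = uri;
                    result[1] = localName;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        // Feed the parser our DOCTYPE first, then the document as served.
        InputStream in = new SequenceInputStream(
                new ByteArrayInputStream(DOCTYPE.getBytes(StandardCharsets.UTF_8)),
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        factory.newSAXParser().parse(in, handler);
        result[2] = text.toString();
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<aws:weather xmlns:aws=\"http://example.com/aws\">"
                + "<aws:temp>21&deg;</aws:temp></aws:weather>";
        String[] r = parse(xml);
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

If the feed ever starts sending its own XML declaration or DOCTYPE, prepending a second DOCTYPE would make the input ill-formed, so this is a workaround rather than a general fix.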
 
zoheb hassan
Ranch Hand
Posts: 154
Android Eclipse IDE Java
Hey guys, great news: the thing started working and is working well now. I don't get why the errors began in the first place, but now everything seems to work just great. Thanks for the support, though. I learned a great deal about parsing, XML, and especially PushbackInputStream. It is a great relief, though it kind of gave me sleepless nights. But all's well that ends well.

P.S.: I will be back
 