BufferedReader and the ] character

 
Ranch Hand
Posts: 275
Hello. I am parsing an XML file, and when I read the following line (via in.readLine()):

<tag>value]</tag>

the line breaks at the ], so the String representation for the line is <tag>value]

then the next line becomes the closing tag:

</tag>

I also noticed this does not happen for every instance where a ] appears in the value. I tested an XML file where the value contained ] in five separate records, and when parsing the file with BufferedReader, the above phenomenon occurred twice, while the line was read completely (<tag>value]</tag>) three times.

This has to be built with JVM 1.3; otherwise, I could use the DOM in 1.4 and above to do this. Thank you very much for reading this and for any input or ideas as to why this occurs.
[ January 17, 2007: Message edited by: Tom Griffith ]
 
author and iconoclast
Posts: 24207
BufferedReader simply doesn't do this; readLine() breaks only at line-ending character sequences (\n, \r, or \r\n). Are you sure the data doesn't include extra embedded ^M (carriage return) or ^J (line feed) characters? Can you open the file in a hex editor to check?
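To make that concrete, here is a minimal sketch, not the original poster's code. The class and method names (ReadLineDemo, readLines, visible) are made up for illustration, and the sample strings are assumptions built from the tag shown in the question. It feeds text to readLine() through a StringReader and prints each returned line with control characters spelled out, so a stray \r or \n after the ] becomes visible. It uses generics, so on a 1.3 compiler the type parameters would have to be dropped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLineDemo {

    // Collects every line that BufferedReader.readLine() returns for the text.
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader in = new BufferedReader(new StringReader(text));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            in.close();
        }
        return lines;
    }

    // Spells out control characters so hidden line terminators can be seen.
    static String visible(String s) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\r') {
                sb.append("\\r");
            } else if (c == '\n') {
                sb.append("\\n");
            } else if (c < 0x20 || c == 0x7f) {
                sb.append("<0x").append(Integer.toHexString(c)).append('>');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed samples: one clean line, one with a line feed hidden after the ].
        String[] samples = {
            "<tag>value]</tag>",
            "<tag>value]\n</tag>"
        };
        for (int i = 0; i < samples.length; i++) {
            System.out.println("input: " + visible(samples[i]));
            List<String> lines = readLines(samples[i]);
            for (int j = 0; j < lines.size(); j++) {
                System.out.println("  line " + (j + 1) + ": " + lines.get(j));
            }
        }
    }
}
```

The ] on its own never splits a line. Only the second sample comes back as two lines, because it contains an embedded line feed.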
 
Tom Griffith
Ranch Hand
Posts: 275
I thought it was really strange too. The XML is published on the web. Here are some actual tags that I copied and pasted from the source.

These are problematic; the line cuts off after the ]:

<page>72 FR 67]</page>
<page>72 FR 98]</page>

These lines are read completely:

<page>72 FR 141]</page>
<page>72 FR 143]</page>
<page>72 FR 177]</page>

I'm not sure what a hex reader is, but I can find out. I do see a trend, though: the BufferedReader appears to cut off all entries with a two-digit number before the ] (<page>72 FR 67]), while entries with three-digit numbers are read completely as one line (<page>72 FR 143]</page>). Really weird. I'm going to try copying and pasting all of these into a local text file to see what a BufferedReader does with it. That will probably help determine whether anything funky is lurking in the published XML.
[ January 17, 2007: Message edited by: Tom Griffith ]
 
Tom Griffith
Ranch Hand
Posts: 275
Hi. I copied and pasted the entire published XML into a local text file, and BufferedReader had no problems reading every line. It never cuts off after the ] sign on any entry, pretty much as you said. This is weird. I guess I'm going to see if I can find out what a hex reader is and read the XML on the web with it.
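A hex editor just shows the raw bytes of a file. For anyone without one handy, the same check can be done in a few lines of Java. This is only a sketch: the class HexPeek and the method toHex are hypothetical names, and the sample bytes are an assumed stand-in for whatever the web stream really delivers. In practice the InputStream would be the one the XML is read from. It avoids generics, so it should be close to what a 1.3 compiler accepts.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class HexPeek {

    // Returns every byte of the stream as two hex digits, separated by spaces.
    static String toHex(InputStream in) throws IOException {
        StringBuffer sb = new StringBuffer();
        int b;
        while ((b = in.read()) != -1) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (b < 0x10) {
                sb.append('0'); // pad single-digit values such as 0a and 0d
            }
            sb.append(Integer.toHexString(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample: a line feed (0a) hiding between the ] and the closing tag.
        byte[] sample = "67]\n</page>".getBytes("US-ASCII");
        System.out.println(toHex(new ByteArrayInputStream(sample)));
        // prints: 36 37 5d 0a 3c 2f 70 61 67 65 3e
    }
}
```

In the output, 5d is the ]. A 0a (line feed) or 0d (carriage return) right after it would explain why readLine() ends the line there.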
 