special characters in xml file

 
Bhasker Reddy
Ranch Hand
Posts: 176
These Spanish characters are causing my XML parser to crash:
<TEXT>¿Prefieres tus facturas en español? Llama al 1-866-xxxxx</TEXT>
<TEXT>para más detalles.</TEXT>
They are giving me this grief:
org.xml.sax.SAXParseException: Character conversion error: "Unconvertible UTF-8 character beginning with 0xbf" (line number may be too low).
at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1048)
at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:520)
at org.apache.crimson.parser.Parser2.parse(Parser2.java:318)
at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:151)
at edocs.xcc.parser.implementation.ParserImp.parse(ParserImp.java:52)
at com.nortel.b2b.dps.parser.xmlparser.planCreator.create(planCreator.java:31)
at com.nortel.b2b.dps.parser.xmlparser.ObjectAgent.run(ObjectAgent.java:60)

Is there a way I can convert these special characters to "?" or something like that?
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13058
I just went through something similar. Apparently it was caused by the way I was setting up the parser input, which led to the wrong character conversion being applied. I had been feeding the parse method the "ins" InputStream obtained by opening a FileInputStream, which caused an exception similar to yours. Using the following code, which creates a Reader and specifies UTF-8 encoding, worked OK.
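The approach described above can be sketched roughly as follows. This is a minimal sketch, not necessarily the code originally posted: the class name ReaderParseSketch, the parse helper, and its charset parameter are made up for illustration. Note that 0xBF is "¿" in ISO-8859-1 but cannot begin a character in UTF-8, so if the file in the question was saved as Latin-1, ISO_8859_1 is the charset to pass rather than UTF_8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only.
public class ReaderParseSketch {

    // Decode the bytes ourselves with an explicit charset, then hand the
    // parser a character stream so it does not have to guess the encoding.
    static Document parse(InputStream ins, Charset charset) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        LineNumberReader reader =
                new LineNumberReader(new InputStreamReader(ins, charset));
        try {
            return builder.parse(new InputSource(reader));
        } catch (SAXParseException e) {
            // Report both the parser's line number and how far the reader got.
            System.err.println("Parse error at line " + e.getLineNumber()
                    + " (reader reached line " + reader.getLineNumber() + "): "
                    + e.getMessage());
            throw e;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample text from the question, encoded as ISO-8859-1 bytes (assumed).
        String xml = "<TEXT>\u00bfPrefieres tus facturas en espa\u00f1ol?</TEXT>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parse(new ByteArrayInputStream(latin1),
                StandardCharsets.ISO_8859_1);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

When an InputSource carries a character stream, the parser reads characters directly and disregards any encoding declaration in the document, so the charset given to the InputStreamReader has to match how the file was actually saved.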


... etc etc catching various parse exceptions and checking the line number from the LineNumberReader.
Bill
 
Bhasker Reddy
Ranch Hand
Posts: 176
Do you guys know of any good books on SAX parsing, or any good resources on the internet? Please let me know. I have to start working on parsing an XML file and converting it to a preprocessed text file.
--Thanks for your help
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13058
Well, as far as I know, the gold standard for books is still Harold's "Processing XML with Java". There are plenty of free tutorials on the web; a Google search for "xml java tutorial" found the Sun tutorial and the entire contents of Harold's book.
Bill
 
Bhasker Reddy
Ranch Hand
Posts: 176
My parse method only takes a fileName and the parser. How do I convert an InputStreamReader into a File?
 
Lasse Koskela
author
Sheriff
Posts: 11962
Originally posted by Bhasker Reddy:
My parse method only takes a fileName and the parser. How do I convert an InputStreamReader into a File?

You don't, unless you're willing to write the InputStreamReader's contents out to a file...

The problem is apparently inside the parsing method so that's what you should be fixing (i.e. that's where you should use the InputStreamReader).
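As a rough illustration of fixing it inside the parsing method: the method can keep taking a file name and open the Reader itself. The real signature of the parse method was not posted, so the class name FileNameParseSketch, the method shape, and the extra charset parameter below are all assumptions.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the poster's parsing class.
public class FileNameParseSketch {

    // Still takes a file name and a parser, but decodes the file through an
    // InputStreamReader with an explicit charset instead of passing raw bytes.
    static Document parse(String fileName, DocumentBuilder parser, Charset charset)
            throws Exception {
        try (Reader reader =
                new InputStreamReader(new FileInputStream(fileName), charset)) {
            InputSource source = new InputSource(reader);
            // The system ID lets the parser resolve relative references (e.g. a DTD).
            source.setSystemId(new File(fileName).toURI().toString());
            return parser.parse(source);
        }
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // args[0] is the XML file; ISO-8859-1 is assumed here, adjust as needed.
        Document doc = parse(args[0], parser, StandardCharsets.ISO_8859_1);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Callers keep passing a file name, so nothing outside the parsing method needs to change apart from supplying the charset.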
 