special characters in xml file

Bhasker Reddy
Ranch Hand

Joined: Jun 13, 2000
Posts: 176
These Spanish characters are causing my XML parser to crash:
<TEXT>¿Prefieres tus facturas en español? Llama al 1-866-xxxxx</TEXT>
<TEXT>para más detalles.</TEXT>
This is the error that is giving me grief:
org.xml.sax.SAXParseException: Character conversion error: "Unconvertible UTF-8 character beginning with 0xbf" (line number may be too low).
at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1048)
at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:520)
at org.apache.crimson.parser.Parser2.parse(Parser2.java:318)
at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:151)
at edocs.xcc.parser.implementation.ParserImp.parse(ParserImp.java:52)
at com.nortel.b2b.dps.parser.xmlparser.planCreator.create(planCreator.java:31)
at com.nortel.b2b.dps.parser.xmlparser.ObjectAgent.run(ObjectAgent.java:60)

Is there a way I can convert these special characters to "?" or something like that?


Bhasker Reddy
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12803
I just went through something similar. Apparently it was caused by the way I was setting up the parser input, which led to the wrong character conversion being applied. I had been feeding the parse method the "ins" InputStream obtained by opening a FileInputStream, which caused an exception similar to yours. Using the following code, which creates a Reader and specifies UTF-8 encoding, worked OK.
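
A minimal sketch of the kind of setup described above (not the code from the original post): the stream is wrapped in an InputStreamReader with an explicit charset, then in a LineNumberReader, and handed to the parser through an InputSource. The class name `EncodingAwareParse`, the `parse` helper, and the sample data are hypothetical. The charset is a parameter because 0xbf is "¿" in ISO-8859-1 and cannot begin a UTF-8 character, so the file in the question may well be Latin-1 rather than UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EncodingAwareParse {

    // Hypothetical helper: decode the bytes ourselves instead of letting
    // the parser guess the encoding from the raw InputStream.
    static Document parse(InputStream ins, Charset charset) throws Exception {
        // LineNumberReader lets error handling report how far reading got.
        LineNumberReader reader =
                new LineNumberReader(new InputStreamReader(ins, charset));
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // An InputSource built from a Reader is parsed as characters,
        // so the parser does no byte-to-character conversion of its own.
        return builder.parse(new InputSource(reader));
    }

    public static void main(String[] args) throws Exception {
        // Sample data (an assumption): the first line from the question,
        // encoded as ISO-8859-1 so that it starts with byte 0xbf.
        byte[] latin1 = "<TEXT>\u00bfPrefieres tus facturas en espa\u00f1ol?</TEXT>"
                .getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parse(new ByteArrayInputStream(latin1),
                StandardCharsets.ISO_8859_1);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

If the file really is UTF-8, pass StandardCharsets.UTF_8 instead. The point is only that the charset is stated explicitly rather than guessed.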


... etc etc catching various parse exceptions and checking the line number from the LineNumberReader.
Bill
Bhasker Reddy
Ranch Hand

Joined: Jun 13, 2000
Posts: 176
Do you guys know any good books on SAX parsers, or any good resources on the Internet? Please let me know. I have to start work on parsing an XML file and converting it to a preprocessed text file.
--Thanks for your help
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12803
Well, as far as I know, the gold standard for books is still Harold's "Processing XML with Java". There are plenty of free tutorials on the web; a Google search for "xml java tutorial" found the Sun tutorial and the entire contents of Harold's book.
Bill
Bhasker Reddy
Ranch Hand

Joined: Jun 13, 2000
Posts: 176
My parse method only takes a fileName and the parser. How do I convert an InputStreamReader into a File?
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
Originally posted by Bhasker Reddy:
My parse method only takes a fileName and the parser. How do I convert an InputStreamReader into a File?

You don't. Except if you're willing to read the InputStreamReader's contents into a file...

The problem is apparently inside the parsing method so that's what you should be fixing (i.e. that's where you should use the InputStreamReader).
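
A rough sketch of what fixing it inside the parsing method could look like, assuming the method receives a file name and a DocumentBuilder. The class `FileNameParse`, the extra `charset` parameter, and the sample file are hypothetical and only stand in for the poster's own code, which is not shown in the thread.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class FileNameParse {

    // Hypothetical stand-in for a parse(fileName, parser) method: the caller
    // still passes a file name, and the Reader is created in here.
    static Document parse(String fileName, DocumentBuilder parser, Charset charset)
            throws IOException, SAXException {
        try (Reader reader =
                new InputStreamReader(new FileInputStream(fileName), charset)) {
            InputSource source = new InputSource(reader);
            // Keep the file location so relative references and error
            // messages can still point at the original file.
            source.setSystemId(new File(fileName).toURI().toString());
            return parser.parse(source);
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample file (an assumption): Latin-1 bytes with no encoding declaration.
        Path tmp = Files.createTempFile("sample", ".xml");
        Files.write(tmp, "<TEXT>para m\u00e1s detalles.</TEXT>"
                .getBytes(StandardCharsets.ISO_8859_1));
        DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = parse(tmp.toString(), parser, StandardCharsets.ISO_8859_1);
        System.out.println(doc.getDocumentElement().getTextContent());
        Files.delete(tmp);
    }
}
```

The method's callers do not change apart from the extra argument; only the way the file is opened does.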


Author of Test Driven (2007) and Effective Unit Testing (2013)