XML -> SAX -> MYSQL conversion losing character encoding...

Ezra Simon
Greenhorn

Joined: Sep 18, 2004
Posts: 9
Hi,

I have a big UTF-8 XML file that contains English, French, German, and Japanese text:

<?xml version="1.0" encoding="UTF-8"?>

I parse it with a SAX parser in the standard way:

SAXParser parser = new SAXParser();
parser.parse(xmlFile);

and it gets inserted into a MySQL database via a prepared statement:

pstmt = con.prepareStatement("INSERT INTO...

At some point the Japanese text loses its encoding and ends up in the database as a bunch of question marks ("???"). Strangely, though, the non-English characters in the French and German text are fine.

I am pretty sure it loses the encoding between XML and Java (not between Java and MySQL), because when I try printing to an HTML page before going to the DB, the same problem occurs.

Any ideas? Do I maybe need to explicitly set the encoding of the InputSource?

thanks for any help,

E.
[ January 23, 2005: Message edited by: Ezra Simon ]
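On the InputSource question above: a minimal sketch of setting the encoding explicitly and checking what the parser actually delivers. It uses the JDK's JAXP SAXParserFactory rather than the parser class in the post, and the class name SaxEncodingCheck, the helper readText, and the sample document are made up for illustration. It prints code points instead of the characters themselves, since a console or HTML page that is not written as UTF-8 can show "???" even when the parsed strings are correct.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original post.
public class SaxEncodingCheck {

    /** Parses the byte stream as UTF-8 and returns all character data concatenated. */
    public static String readText(InputStream in) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver text in several chunks, so append each one.
                text.append(ch, start, length);
            }
        };
        InputSource source = new InputSource(in);
        // Explicit encoding for the byte stream; this takes the place of
        // auto-detection from the XML declaration.
        source.setEncoding("UTF-8");
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(source, handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption): three Japanese characters as escapes.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><t>\u65e5\u672c\u8a9e</t>";
        String parsed = readText(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Print code points so the result does not depend on console encoding.
        for (char c : parsed.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

If the code points come out right here, the XML-to-Java step is probably fine and the loss happens later, on output or on the way into the database.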
Ezra Simon
Greenhorn

Joined: Sep 18, 2004
Posts: 9
Actually, after some further testing this seems to be a MySQL problem, not an XML one. I will post the specifics, but if anyone has any info it would be helpful.

thanks.
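One possible explanation for the symptom, not confirmed in this thread: somewhere between Java and MySQL the text is being encoded as Latin-1. French and German accented letters exist in Latin-1, Japanese characters do not, and unmappable characters get replaced with "?". The sketch below reproduces that effect in plain Java. The class name Latin1LossDemo and the host and database names in the URL are hypothetical; useUnicode and characterEncoding are MySQL Connector/J connection properties, and whether they are the fix here is an assumption, since the table and column character sets also have to be able to store the text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original post.
public class Latin1LossDemo {

    // Hypothetical JDBC URL (host and database names are placeholders).
    // The two parameters ask Connector/J to send text to the server as UTF-8.
    static final String EXAMPLE_URL =
        "jdbc:mysql://localhost/mydb?useUnicode=true&characterEncoding=UTF-8";

    /** Encodes to ISO-8859-1 and decodes again, as a Latin-1 hop would. */
    public static String roundTripLatin1(String s) {
        // Characters with no Latin-1 mapping are replaced with '?' here.
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "café" survives because é is in Latin-1: prints true
        System.out.println(roundTripLatin1("caf\u00e9").equals("caf\u00e9"));
        // Three Japanese characters do not survive: prints ???
        System.out.println(roundTripLatin1("\u65e5\u672c\u8a9e"));
    }
}
```

This matches the pattern in the first post: accented European characters arrive intact while Japanese turns into question marks.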
 