
XML -> SAX -> MYSQL conversion losing character encoding...

 
Ezra Simon
Greenhorn
Posts: 9
Hi,

I have a big UTF-8 XML file that contains English, French, German, and Japanese text:

<?xml version="1.0" encoding="UTF-8"?>

I parse it with a SAX parser in the standard way:

SAXParser parser = new SAXParser();
parser.parse(xmlFile);

and it gets inserted into a MySQL database via a prepared statement:

pstmt = con.prepareStatement("INSERT INTO...

At some point the Japanese text loses its encoding and ends up in the database as a bunch of question marks ("???"). Strangely, though, the non-English characters in the French and German text are fine.

I am pretty sure it loses the encoding between XML and Java (not between Java and MySQL), because when I try printing to an HTML page before going to the DB, the same problem occurs.

Any ideas? Do I maybe need to explicitly set the encoding of the InputSource?
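
To illustrate the InputSource question, here is a minimal sketch of setting the encoding explicitly before parsing. It uses the JAXP `SAXParserFactory` rather than the `new SAXParser()` call above, and the class name `ExplicitEncodingParse`, the method `parseText`, and the sample element `<t>` are all made up for the example. Printing code points instead of raw characters avoids confusing a parsing problem with a console or page encoding problem.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original code.
public class ExplicitEncodingParse {

    // Parses the stream as UTF-8 and returns all character data it contains.
    static String parseText(InputStream in) throws Exception {
        InputSource source = new InputSource(in);
        // Tell the parser which encoding the byte stream uses.
        source.setEncoding("UTF-8");

        StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // characters() may be called several times per element, so append.
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document with three Japanese characters, written as Unicode escapes.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><t>\u65e5\u672c\u8a9e</t>";
        String parsed = parseText(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Print code points so the check does not depend on the console encoding.
        for (char c : parsed.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

If the code points come out right here, the text is intact inside Java, and the "???" is more likely introduced when the strings are written out (to the page or to the database).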

thanks for any help,

E.
[ January 23, 2005: Message edited by: Ezra Simon ]
 
Ezra Simon
Greenhorn
Posts: 9
Actually, after some further testing, this seems to be a MySQL problem, not an XML one. I will post the specifics, but if anyone has any info it would be helpful.

thanks.
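
For the MySQL side, one thing that is often worth checking is the encoding used by the JDBC connection. The following is only a sketch, assuming the MySQL Connector/J driver and its `useUnicode` and `characterEncoding` URL properties; the class name `MysqlUnicodeUrl`, the host `localhost`, and the database `mydb` are placeholders, not values from this thread. The target column also has to use a character set that can store Japanese (for example utf8), otherwise the server may still substitute "?".

```java
// Hypothetical helper class, not part of the original code.
public class MysqlUnicodeUrl {

    // Builds a Connector/J URL that asks the driver to exchange strings as UTF-8.
    static String unicodeUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Placeholder host and database name (assumptions for the example).
        String url = unicodeUrl("localhost", "mydb");
        System.out.println(url);
        // The URL would then be passed to DriverManager.getConnection(url, user, password).
    }
}
```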
 