Problem reading a multilingual XML file into a database
Joined: Mar 30, 2005
Hi. I have been fighting for two days with a severe problem in the XML/Java/JDBC arena. I have an XML file, a glossary containing several thousand definitions. Each definition is a term with two parts: one part is in Hebrew and the other is in English (the Hebrew is an explanation of the English term).
The XML file is Unicode. I tried to read it with XMLdocument and store it in a database, but a big problem arises here.
All Hebrew characters are replaced with ? in the database. Does anyone know why? I tried almost all the open-source databases, such as MySQL (I used the useUnicode parameter for it), PostgreSQL, Firebird and Cloudscape, but all I get is still ???. Should I do some conversion before inserting the data into the database?
In brief: my problem is reading an XML file with multi-language content and inserting its data into a database in such a way that my Hebrew text is not converted to ???.
I asked about this in the XML forum, and it seems there is no problem in reading the XML (?). Doesn't Java convert my UTF data to another encoding?
What client character set are you using? Presuming you have set up Oracle to store strings as Unicode, the Unicode values for the Hebrew characters should be stored fine. When you read them back via SQL*Plus, though, unless you have indicated which NLS environment you want to use, your Hebrew characters will just be rendered using whatever the default client character set is.
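One way to tell a display problem from a storage problem is to look at the code points of the string you read back instead of at how a console or client tool renders it. The following is only a minimal sketch: the class and method names (CodePointDump, describe, containsHebrew) are made up for illustration, and the two sample strings stand in for a value that would really come from something like ResultSet.getString. If the dump shows U+003F, real question marks were stored; if it shows values in the Hebrew block (U+0590 to U+05FF), the data is intact and only the rendering is wrong.

```java
import java.util.stream.Collectors;

class CodePointDump {
    // Renders every code point as U+XXXX, so the output does not depend on
    // the character set of the console or client tool.
    static String describe(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    // True if at least one code point lies in the Unicode Hebrew block.
    static boolean containsHebrew(String text) {
        return text.codePoints().anyMatch(cp -> cp >= 0x0590 && cp <= 0x05FF);
    }

    public static void main(String[] args) {
        // Stand-ins for values read back from the database (assumption).
        String intact = "\u05E9\u05DC\u05D5\u05DD"; // four Hebrew letters
        String damaged = "????";                    // literal question marks

        System.out.println(describe(intact) + " hebrew=" + containsHebrew(intact));
        System.out.println(describe(damaged) + " hebrew=" + containsHebrew(damaged));
    }
}
```

Running the same dump on the string right after parsing the XML and again after reading it back from the database should show at which step the characters are lost.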
Hi, thank you for the response. I think my client charset is correct, because I can see the XML file content correctly when I open it in Eclipse or WordPad. But when the data is stored in the database and I retrieve it, all I get is ???.
By "client character set" I mean: what client character set is Oracle configured to use? Check the NLS_LANG parameter (which can be set as an environment variable or in the Registry) to see what charset it is using. This is the parameter that governs what charset all client connections to the database server use, so your Windows environment might be using one charset (or "code page", since it's Windows), but client connections to the server might be using another.
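To illustrate why a mismatched charset somewhere between Java and the server produces exactly ???, here is a small, self-contained sketch. It is not the original poster's code: the class name, the method names and the one-element sample document are assumptions for illustration. It parses UTF-8 XML bytes, shows that the Hebrew survives inside Java, and then pushes the same string through a charset that has no Hebrew letters (ISO-8859-1), which replaces every unmappable character with a question mark.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class GlossaryEncodingCheck {
    // Parses the XML bytes and returns the text of the root element.
    // The parser decodes the bytes using the encoding in the XML declaration.
    static String readDefinition(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Encodes the text with the given charset and decodes it again,
    // imitating a client connection that uses that charset.
    static String throughCharset(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Hypothetical one-entry glossary; the real file has another layout.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<definition>\u05E9\u05DC\u05D5\u05DD</definition>";
        String hebrew = readDefinition(xml.getBytes(StandardCharsets.UTF_8));

        // UTF-8 can represent Hebrew, so the round trip is lossless.
        System.out.println(throughCharset(hebrew, StandardCharsets.UTF_8).equals(hebrew));
        // ISO-8859-1 cannot, so each letter becomes '?'.
        System.out.println(throughCharset(hebrew, StandardCharsets.ISO_8859_1));
    }
}
```

If the string is still intact after parsing, the loss is happening at the connection or column level rather than in the XML reading. With MySQL Connector/J, for example, that usually means setting useUnicode=true together with characterEncoding=UTF-8 in the JDBC URL and making sure the column itself uses a Unicode character set; with Oracle it is the NLS_LANG setting described above.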