| Author |
character encoding (unicode to utf-8) conversion problem
|
Yahya Elyasse
Ranch Hand
Joined: Jul 07, 2005
Posts: 510
|
|
I have run into a problem that I can't seem to find a solution to. My users are copying and pasting from MS Word. My DB is Oracle with its encoding set to "UTF-8". I am using Oracle's thin driver, which automatically converts to the DB's default character set. When Java tries to encode Unicode to UTF-8 and runs into an unknown character (typically a character in the high-ASCII range), it substitutes it with '?' or some other weird character. How do I prevent this?

I tried different encodings using a simple driver like:

But that didn't work. Then I tried a more elaborate conversion:

I tried a variation of the second code snippet that inserts into the DB, just to see the results, and it was a no go. I don't want '?' replacing the unknown chars. I would rather strip them or replace them with ' ', but I haven't been able to get that to work (using the second bit of code). Any ideas on what I am doing wrong? Thanks,
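For reference, here is a minimal sketch of the "strip or replace with a space" behaviour described above, using the standard `java.nio.charset.CharsetEncoder` API. The class name `EncodingSketch` and the two helper methods are hypothetical, and this is not the code from the original post. It assumes a single-byte target charset such as US-ASCII for the demonstration. UTF-8 itself can represent every Unicode character, so with a UTF-8 target nothing is dropped or replaced, and the '?' probably comes from a narrower charset somewhere along the path.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

public class EncodingSketch {

    // Hypothetical helper: drop every character the target charset cannot represent.
    static String stripUnmappable(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.IGNORE)
                .onUnmappableCharacter(CodingErrorAction.IGNORE);
        return roundTrip(text, cs, enc);
    }

    // Hypothetical helper: replace every unrepresentable character with a space.
    // Assumes a space encodes to a legal byte sequence in the target charset.
    static String replaceUnmappable(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(" ".getBytes(cs));
        return roundTrip(text, cs, enc);
    }

    // Encode with the configured encoder, then decode so the result can be inspected as a String.
    private static String roundTrip(String text, Charset cs, CharsetEncoder enc) {
        try {
            ByteBuffer bytes = enc.encode(CharBuffer.wrap(text));
            return cs.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Not expected with IGNORE/REPLACE actions, but encode() declares it.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u201C and \u201D are the curly quotes MS Word typically inserts.
        String pasted = "say \u201Chello\u201D";
        System.out.println(stripUnmappable(pasted, "US-ASCII"));
        System.out.println(replaceUnmappable(pasted, "US-ASCII"));
        System.out.println(stripUnmappable(pasted, "UTF-8").equals(pasted));
    }
}
```

If the UTF-8 round trip leaves the text intact here but the DB still shows '?', the loss is probably happening elsewhere, for example when the pasted form data is decoded or in the client or DB character set settings, rather than in the UTF-8 encoding step.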
|