a matter of encoding...

Kareem Gad
Ranch Hand

Joined: Aug 06, 2001
Posts: 89
Hi all, it's been a while since I've posted anything (lazy me).
I am posting this on the advanced forum as I did not find a more appropriate forum to ask in.

I have an encoding dilemma. In short:

A web application is required to save Arabic data to an Oracle database. Later an Oracle report is run to display this data, which should of course display correctly.

Long version:

On my development machine (WebLogic 8.1 on Windows):
-The web application, using Struts, reads data input from a normal form, with the encoding set to windows-1256.
-The web application saves the data to the database as windows-1256.
-The Oracle report runs and reads the data correctly. (The NLS_LANG setting on the Oracle report server is AMERICAN_AMERICA.AR8MSWIN1256.)

Deploying this same code to the staging machine (WebLogic 8.1 on Solaris):
-The web application, using Struts, reads data input from a normal form, with the encoding set to windows-1256.
-The web application sees the Arabic form data as ??? (question marks).
-The web application saves the data to the database as windows-1256 (but of course only question marks get saved).
-The Oracle report runs and reads the data as stored, so naturally only question marks appear. (The NLS_LANG setting on the Oracle report server is AMERICAN_AMERICA.AR8MSWIN1256.)
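
For readers wondering where question marks like these can come from: one common cause is that text gets encoded with a charset that has no Arabic letters, in which case Java substitutes the charset's replacement byte. The sketch below only illustrates that mechanism. US-ASCII is an assumption picked for the demonstration (the thread does not say which charset was actually involved on staging), and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Encodes the text with the given charset, then decodes the bytes
    // again with that same charset.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Four Arabic letters, written as Unicode escapes so the
        // encoding of this source file does not matter.
        String arabic = "\u0633\u0644\u0627\u0645";

        // UTF-8 can represent Arabic, so the text survives.
        System.out.println(roundTrip(arabic, StandardCharsets.UTF_8).equals(arabic)); // prints true

        // US-ASCII cannot, so every letter is replaced by '?'.
        System.out.println(roundTrip(arabic, StandardCharsets.US_ASCII)); // prints ????
    }
}
```

Once the replacement has happened the original letters are gone, which is why the database and the report can only ever show question marks afterwards.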

Apparently the windows-1256 encoding is somehow not understood on Solaris, although this page mentions that this encoding is supported by JVM 1.4.2 on all platforms, Solaris included.

I decided not to go down the unknown path of researching the differences between the JVMs on my Windows machine and on Solaris, and all the related environment variables.

Instead I decided to do everything in UTF-8 (the form, the submission to the Struts application, and the manipulation within the application) and, as the last step before writing to the DB, to convert the string from UTF-8 to Cp1256 (which is windows-1256).

I need to write 1256 into the DB as this is (until further notice from our Oracle developers team) the only way to read Arabic properly with their current setup. They can't read UTF-8 properly and are working on it.

Back to me: I'm having trouble converting my strings from UTF-8 to Cp1256 before writing them to the DB.
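
As a pointer on the conversion itself: a Java String has no UTF-8 or Cp1256 "form" of its own, so the charset only matters at the moment bytes are turned into a String or a String into bytes. A minimal sketch of that step follows, assuming the JVM ships the windows-1256 charset; the class and method names are hypothetical, and how the text then reaches the database depends on the JDBC driver, which is not shown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Cp1256Demo {
    // Assumption: this JVM includes the "windows-1256" charset
    // (also known to Java under the alias Cp1256).
    static final Charset CP1256 = Charset.forName("windows-1256");

    // Re-encodes UTF-8 bytes as windows-1256 bytes by decoding them
    // into a String first and then encoding that String again.
    static byte[] utf8ToCp1256(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        return text.getBytes(CP1256);
    }

    public static void main(String[] args) {
        // Four Arabic letters, written as Unicode escapes.
        String arabic = "\u0633\u0644\u0627\u0645";
        byte[] utf8 = arabic.getBytes(StandardCharsets.UTF_8);
        byte[] cp1256 = utf8ToCp1256(utf8);

        System.out.println(utf8.length + " bytes as UTF-8, "
                + cp1256.length + " bytes as windows-1256");
        // Decoding with the matching charset gives the original text back.
        System.out.println(new String(cp1256, CP1256).equals(arabic));
    }
}
```

If the String already holds question marks at this point, no conversion will bring the Arabic back; the damage happened earlier, when the request was decoded.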

Any hints, even just on a direction I should follow?

Much appreciated ..


KaReEm
SCJP-Free Range Web Developer
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336


I need to write 1256 into the DB as this is (until further notice from our Oracle developers team) the only way to read Arabic properly with their current setup. They can't read UTF-8 properly and are working on it.


That makes me a bit suspicious. The simple way is to make the database support unicode, then you don't have to convert anything. Oracle will happily support UTF-8, unless you are using an unsupported older version of Oracle (7 and below). Oracle's "Database Globalization Support Guide" covers this.


Kareem Gad
Hi Paul,

No, we are using Oracle 10g, so the version is not an issue. It's the Oracle developer "know-how" here that IS, which confined us to the only working solution they came up with: showing Arabic out of windows-1256 encoded data.

Anyway, I resolved the issue... phew... after two long all-nighters!

It turns out a default character set was defined on the web server through which I was accessing the application (which is deployed on the application server behind it) in the staging environment.

The default encoding from the web server was overriding whatever I set through my application. Let's just say it had the LAST word. So my strings ended up getting double encoded whenever I tried anything.
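
To make "double encoded" concrete for later readers, here is a small sketch of what a charset disagreement between two layers does to a string. It assumes, purely for illustration, that the text is written as UTF-8 and read back as ISO-8859-1; the thread does not say which charset the web server was actually configured with, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MismatchDemo {
    // Encodes the text with one charset and decodes the resulting bytes
    // with a different one, as happens when two layers disagree.
    static String misread(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // Four Arabic letters, written as Unicode escapes.
        String arabic = "\u0633\u0644\u0627\u0645";

        // Each Arabic letter is two bytes in UTF-8; read as ISO-8859-1,
        // every byte turns into its own (wrong) character.
        String garbled = misread(arabic, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // prints 8

        // Encoding the garbled text as UTF-8 again doubles the size once more.
        System.out.println(garbled.getBytes(StandardCharsets.UTF_8).length); // prints 16

        // With this particular pair the damage can still be undone.
        String repaired = misread(garbled, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(repaired.equals(arabic)); // prints true
    }
}
```

The repair in the last step works only because ISO-8859-1 maps every byte to a character; with other charset pairs information can be lost for good, so fixing the configuration at the source is the safer route.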

I fixed that, and now everything works as I want and expect it to.

Note: for people who might reach this topic in the future, I am talking about the AddDefaultCharset directive, which can be set in Apache's httpd.conf.
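
For orientation, the relevant lines in httpd.conf might look something like the fragment below. This is only a sketch: the thread does not show the actual configuration or say which value was finally used, so both options are assumptions, and the right choice depends on the setup.

```apache
# Option 1: stop Apache from adding a default charset to responses,
# leaving the charset declared by the application in place.
AddDefaultCharset Off

# Option 2 (alternative): make the server default agree with the
# encoding the application expects.
# AddDefaultCharset windows-1256
```

Apache has to be restarted or reloaded for a change in httpd.conf to take effect.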

If you happen to be using another web server, check its documentation on how to set the default encoding.

I'm a happy man
[ September 13, 2006: Message edited by: Kareem Gad ]
Paul Sturrock


I'm a happy man

Good. And thanks for taking the time to post your solution.
Kareem Gad
Originally posted by Paul Sturrock:

And thanks for taking the time to post your solution.


... it's a matter of respect

I get my solutions here in the saloon... it's the least I can do...

<--- just being silly
 