
Read Arabic text in Servlet

carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

Hi. In my application I am retrieving data from a database that contains both Arabic and English text. Oracle stores the Arabic data, but when I retrieve it, it just displays '???' in place of the Arabic text, whereas the English text is displayed properly. I also tried adding
response.setCharacterEncoding("Windows-1256");
in the servlet that gets the data from the database, but it still displays ???.

Any suggestions, please?
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

Have a read through this. It is a very good introduction into all things to do with character encoding.


carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

I am trying out a sample first. Here I changed the charset to ISO-8859-6, which supports Arabic text. Now when I try to save this program in Eclipse I get the error below (I also tried the charset cp420 and got the same error message):

Could not save file

Some characters could not be mapped using cp1252 character encoding. Either remove the characters or change the encoding. (from Eclipse)


From the command prompt:

Exception in thread "main" java.nio.charset.UnsupportedCharsetException: ISO-889-6
at java.nio.charset.Charset.forName(Unknown Source)
at Test.main(Test.java:17)

C:\Tomcat 5.5\webapps\AutomatedIntelligence>javac Test.java

C:\Tomcat 5.5\webapps\AutomatedIntelligence>java Test
Exception in thread "main" java.nio.charset.UnsupportedCharsetException: cp420
at java.nio.charset.Charset.forName(Unknown Source)
at Test.main(Test.java:17)





[ November 11, 2008: Message edited by: ruquia tabassum ]
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

Since this is not a Servlet question I'll move it to a more appropriate forum.
[ November 11, 2008: Message edited by: Paul Sturrock ]
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    

Hi there,

Sorry, I got a little confused by your last post. Are you saying that you are getting the ISO-8859-6 UnsupportedCharsetException from the command line and the cp420 UnsupportedCharsetException from Eclipse?

Be aware of the default character encoding for Eclipse files and the Eclipse console (look under Preferences).


carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

When I run the above program from the command prompt it gives me the error below:

Exception in thread "main" java.nio.charset.UnsupportedCharsetException: ISO-8859-6
at java.nio.charset.Charset.forName(Unknown Source)
at Test.main(Test.java:13)
[ November 12, 2008: Message edited by: ruquia tabassum ]
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    

Ah, I think that means that the shell you are running in does not support the ISO-8859-6 encoding. Are you running Windows, *nix, or something else?

Also try changing the encoding to UTF-8 and see what happens.

Whether I'm in ISO-8859 or UTF-8 I don't get any errors but I do get:

??? (if I run from Eclipse)
����� (if I run from the command prompt)

Which quite clearly is not سيب, but that's because I don't have either console set up to display that encoding properly.

Hope that helps steer you in the right direction...
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    

Originally posted by Paul Sturrock:
Have a read through this. It is a very good introduction into all things to do with character encoding.


Fantastic article BTW, I re-read it once a year.
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

I changed my program and removed the characters سيب. The program compiled correctly, and when I checked the available charsets I got ISO-8859-6 as one of the supported charsets.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    

Hi there,

What is your command line shell? Is it Windows or *nix? What is the default encoding for that shell?
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

My command shell is Windows... and how do I find out the default encoding?
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
Regarding your sample program.. Are you perhaps making a typo?

I noticed the exception shows "ISO-889-6" and not "ISO-8859-6"

You can use chcp in your command line in windows to check the default code page.

Mine gives..

C:\>chcp
Active code page: 437

But I think this is the code page used for the command line window for display purposes.

System.out.println(System.getProperty("file.encoding"));
shows the code page to be Cp1252

This article is a good read that should clear up a few things.
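
The two checks discussed here (what the JVM's default encoding is, and whether a charset name is recognized at all) can be sketched in a few lines. This is only an illustration under assumptions: the class name CharsetCheck and its helper are hypothetical, and it is not the Test.java from this thread, which is not shown. It also hints at why a misspelled name such as "ISO-889-6" would make Charset.forName throw UnsupportedCharsetException.

```java
import java.nio.charset.Charset;

// Hypothetical helper class for illustration; not the Test.java from the thread.
class CharsetCheck {

    // True if this JVM recognizes the charset name (canonical name or alias).
    static boolean isKnown(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        // Default charset of the JVM; this is separate from the console
        // code page that chcp reports on Windows.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        // Checking first avoids the UnsupportedCharsetException that
        // Charset.forName throws for an unknown name.
        for (String name : new String[] {"UTF-8", "ISO-8859-6", "ISO-889-6"}) {
            System.out.println(name + " supported: " + isKnown(name));
        }
    }
}
```

UTF-8 is one of the charsets every Java platform must provide. Whether ISO-8859-6 is available depends on the charsets installed with the particular JVM, so the output for that name may differ between machines.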
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

In the servlet I set response.setCharacterEncoding("UTF-8");

and Yipeeeeeeeeee......... I got the Arabic text..
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

Now when I try to generate an XML file from my servlet, the XML again shows ??? instead of the Arabic text. I changed the encoding in the XML file to ISO-8859-6, but the XML file still shows ???.

Whereas when I generate an Excel file, the Arabic text is printed properly as Arabic text.
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
How are you viewing the XML file? What is the viewer?

How do you write the XML file to disk? Are you specifying a character encoding for it?
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    

Well done on making progress!

You need to ensure that:

A.) Whenever you 'shift' the data to a new medium, you encode it correctly (e.g. UTF-8). Some 'mediums' have a default encoding, so it's always safest to set this explicitly.

B.) When you are viewing the data, your viewer (whether it be the command line, Eclipse, Notepad, whatever) supports displaying the data in the encoding that you've chosen.

Hint: you probably need to look at point A.) for your XML.
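
Point A.) can be illustrated with a small round trip. This is a minimal sketch, assuming nothing about the original servlet code (which is not shown in the thread); the class name EncodingRoundTrip is hypothetical. It encodes the three Arabic letters from earlier in the thread to bytes and decodes them again, once with UTF-8 and once with a charset that has no Arabic letters, which is one common way '?' characters appear.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only illustrates where '?' can come from.
class EncodingRoundTrip {

    // Encode text to bytes with the given charset, then decode the bytes
    // again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        // The three Arabic letters used earlier in the thread, written as
        // Unicode escapes so the source file can be saved in any encoding.
        String arabic = "\u0633\u064A\u0628";

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(arabic.equals(roundTrip(arabic, StandardCharsets.UTF_8))); // prints true

        // ISO-8859-1 has no Arabic letters; each one is replaced by '?'
        // at the moment the string is encoded to bytes.
        System.out.println(roundTrip(arabic, StandardCharsets.ISO_8859_1)); // prints ???
    }
}
```

Once the characters have been replaced during encoding, no viewer setting can bring them back, which is why the encoding has to be right at every step where the data changes medium.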
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

In my XML I specify the encoding as ISO-8859-6, which supports Arabic. This XML is created dynamically when the servlet runs.

servlet


Whenever I run the servlet it generates the XML output as


Seeing this, I tried writing a separate (static) XML file with the encoding set to ISO-8859-6, and inside the <cell> tag I put some Arabic text. When I open this XML in a browser I can see the Arabic text, which shows that ISO-8859-6 supports Arabic text.


But why is the text not displayed when the XML is generated dynamically as servlet output?
Any suggestions, please?
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
When you do response.setCharacterEncoding("UTF-8");

you shouldn't do out.println("<?xml version=\"1.0\" encoding=\"ISO-8859-6\"?>");

you should do out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");

Same logic follows for any other encoding you want to use.

However, XML is built on Unicode, so using UTF-8 or UTF-16 would be a good idea.
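
The rule above (the encoding named in the XML declaration must match the encoding actually used to write the bytes) can be sketched outside a servlet. This is an illustration only: the class XmlEncodingDemo and the <row>/<cell> layout are assumptions, not the poster's code. It writes the XML through a Writer with an explicit charset, puts the same charset name into the declaration, and then parses the bytes back with the JDK's XML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; element names are made up for illustration.
class XmlEncodingDemo {

    // Writes a tiny XML document whose declaration names the same charset
    // that is used to turn the characters into bytes.
    // cellText is assumed to contain no '<' or '&' (no escaping is done here).
    static byte[] writeXml(String cellText, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            out.write("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>\n");
            out.write("<row><cell>" + cellText + "</cell></row>\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Parses the bytes and returns the text of the first <cell> element.
    static String readCell(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getElementsByTagName("cell").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String arabic = "\u0633\u064A\u0628";
        byte[] xml = writeXml(arabic, StandardCharsets.UTF_8);
        System.out.println(arabic.equals(readCell(xml))); // prints true
    }
}
```

In a servlet the Writer comes from response.getWriter(), and its charset is whatever was set with response.setCharacterEncoding (or the charset in the content type) before getWriter() was called, so that is the name the declaration should carry.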
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

I tried both combinations with no change; I still get the same ???.

servlet

output
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
Well this has got me stumped too.

How are you getting the arabic text that is sent out? Is it hard coded for this test? Or are you getting it from some other source?
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274
    

Also, how are you viewing the end result? Refer to my point B.) earlier: it may be that your 'viewer' cannot display those characters (even though you've encoded them correctly).
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

I am getting the Arabic text from an Oracle database.

When I get the text and generate an Excel file with it, the text is displayed as Arabic characters in the generated Excel file.

Whereas when I try to generate an XML file, the Arabic characters are displayed as ???.


This shows that the data I am retrieving from the Oracle database is properly formatted.




Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
Does this code prompt you to download the output as a file?

If so, then one possibility is that, as Martijn said, the Excel viewer is able to correctly interpret and render UTF-8 while whatever viewer you are using for the XML file is unable to do so.

Also another wild guess... try doing PrintWriter out = response.getWriter(); in your XML outputting code too.
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
So.. you could try not calling response.setContentType("application/xml"); and just calling response.setCharacterEncoding("UTF-8"); to see if it renders correctly on a web browser.

Read the Servlet API for more information on this.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18129
    

I notice that nobody has posted a link to Character Conversions from Browser to Database yet. It covers pretty much everything discussed in this thread so far.

And I would recommend just using UTF-8. Using ISO-8859-6 is just going to lead to problems with systems that can't use it, and it's unnecessary because UTF-8 works perfectly well for Arabic scripts.
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
I did too!

Maybe the link was too small to see..

Anyhow, how are you getting on ruquia?
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

Hi,
When I commented out



output




it prints ???.
I changed the setting in Internet Explorer to View-->Encoding-->Auto-Select and also to View-->Encoding-->Arabic (Windows); it still prints ???.
[ November 15, 2008: Message edited by: ruquia tabassum ]
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 347
Well, I've run out of ideas. Can anybody else help? Any progress, Ruquia?
carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

Hey, good to see the reply...
After all my struggling I could generate the Arabic text in .xml and .xls files.

.xml file generation



This generated an XML file with the proper output.




Now I am generating the .xls file, which contains 14 columns. If I print only 12 columns, the Arabic text is printed properly; when I print all 14 columns, the Arabic text is not printed. I don't know the reason for this strange behaviour.

.xls file generation



As you can see from the above code, I am not printing PropertyCount and ConnectionFee in the Excel file, and this code prints the Arabic text properly in the generated file. When I add PropertyCount and ConnectionFee, it does not print the Arabic text.




This strange behaviour is eating up my time.

Output in the .xls file with 14 columns (PropertyCount & ConnectionFee added):

carina caoor
Ranch Hand

Joined: Jun 23, 2007
Posts: 300

At last I could solve it. Now I have the Arabic text in both the XML and the .xls files...
I used HSSFWorkbook (Apache POI) and everything was fine. Thank Lord...
Doug Douglas
Greenhorn

Joined: Aug 04, 2011
Posts: 2
rocx sum wrote:Atlast i could solve it now i have got the arabic text in both the xml and the .xls files...
i used hssf workbook and everything was fine Thank Lord...


I have the same problem and have spent 2 days on it. Could you please explain the solution more clearly?
 