How to convert a character into unicode

arun mahajan
Ranch Hand

Joined: Dec 07, 2001
Posts: 305
I have an Arabic font available on my system. Now I want to convert a particular Arabic word into its Unicode values. What I mean is: say I wrote Salam in Arabic and its Unicode equivalent is 0640 0641 0642 0641 0643, where 0640=S, 0641=a, 0642=l and 0643=m (all these values are assumptions, not the real ones). Is there any class available in Java which can get me these Unicode values, i.e. 0640 etc.? Or do I have to maintain a database of that language and compare against it?
Please help.
Dave Landers
Ranch Hand

Joined: Jul 24, 2002
Posts: 401
You can create a unicode literal using \uXXXX - for example (using your made-up unicode values):
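A minimal sketch of such a literal, reusing the made-up values from the question (the class and constant names are hypothetical, chosen only for illustration):

```java
public class UnicodeLiteralSketch {
    // The made-up values from the question, written as backslash-u escapes.
    // The compiler turns each escape into a single char.
    static final String SALAM = "\u0640\u0641\u0642\u0641\u0643";

    public static void main(String[] args) {
        // One char per escape, so the string has five chars.
        System.out.println(SALAM.length()); // prints 5
    }
}
```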

If you want to get the numeric value of some character, you can use the char type:
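Something along these lines, as a rough sketch (the class name, the helper `firstChar` and the sample word are assumptions for illustration; the word again uses the made-up values from the question):

```java
public class CharValueSketch {
    // Returns the first char of the word; a char holds one 16-bit UTF-16 code unit.
    static char firstChar(String word) {
        return word.charAt(0);
    }

    public static void main(String[] args) {
        char first = firstChar("\u0640\u0641\u0642\u0641\u0643");
        // A char can be compared directly against a numeric value.
        System.out.println(first == 0x0640); // prints true
    }
}
```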

The char can be cast to an int and then you can use it as any other number:
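As a sketch of that cast, the helper below (hypothetical name `toHex`) walks a string and prints each char as four hex digits, which is roughly the output the original question asks for. It assumes every character fits in a single char, which holds for Arabic letters but not for characters outside the Basic Multilingual Plane:

```java
public class CharToHexSketch {
    // Formats each char (UTF-16 code unit) of the input as four hex digits,
    // separated by spaces.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            int value = (int) s.charAt(i); // cast the char to an int
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", value));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up values from the question, not the real spelling of the word.
        System.out.println(toHex("\u0640\u0641\u0642\u0641\u0643"));
        // prints: 0640 0641 0642 0641 0643
    }
}
```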
Cindy Glass
"The Hood"

Joined: Sep 29, 2000
Posts: 8521
You want to return a String representation of the unicode value?
The Java compiler converts typed characters into Unicode before converting them into bytecode. Most everything in the JRE is built with the intent of converting incoming Unicode into its correct character.
You will probably need to load the conversion mapping into a Map or whatever to get the job done.

"JavaRanch, where the deer and the Certified play" - David O'Meara
Jim Yingst

Joined: Jan 30, 2000
Posts: 18671
If you don't already know the Unicode values of the character you want to represent, Java does have the ability to convert from a variety of other encodings. You would need to know some way of representing "Salam" in a text file of some sort - preferably using one of these encodings. And you'd need to know which encoding had been used. Let's say you have a file saying "Salam" using Cp420, which is IBM Arabic. Then, to read the file contents and convert them to a Unicode String:
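A rough sketch of that kind of read, using InputStreamReader. The file name `salam.txt`, the class name and the helper `readAll` are assumptions for illustration, and whether the name "Cp420" is accepted depends on which charsets your JRE ships with:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class EncodingReadSketch {
    // Reads the whole stream, decoding its bytes with the named encoding.
    // Throws UnsupportedEncodingException (an IOException) if the name is unknown.
    static String readAll(InputStream in, String encoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, encoding)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "salam.txt" is a hypothetical file saved in Cp420 (IBM Arabic).
        try (InputStream in = new FileInputStream("salam.txt")) {
            String text = readAll(in, "Cp420");
            System.out.println(text.length() + " chars read");
        }
    }
}
```

Once the text is in a String it is already Unicode internally, so each char can be inspected as a number.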

There are several classes that allow you to specify encoding conversions like this - InputStreamReader, OutputStreamWriter, String (in some constructors and in getBytes()), Channels (newReader() and newWriter()). Which you use depends on whether you are using streams, byte arrays, or channels. Hope that helps.
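For the byte-array case, a small sketch using the String constructor and getBytes(). UTF-8 is used here only because every JRE is required to support it; it is my choice for the illustration, not something from the question, and the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

public class StringEncodingSketch {
    // Encodes the text into bytes using UTF-8.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes back into a (Unicode) String.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0633"; // ARABIC LETTER SEEN
        byte[] bytes = encode(original);
        System.out.println(bytes.length);                    // prints 2
        System.out.println(decode(bytes).equals(original));  // prints true
    }
}
```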
[ August 20, 2002: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Brendan Fosberry
Ranch Hand

Joined: Dec 16, 2005
Posts: 33
This is similar to a problem I'm having at the minute. I don't know too much about fonts and charsets, but at the minute my server program takes an email and extracts some text from it. The text is stored in a String and then exported to a PDF. It works for English. Yay. Now I need it to be able to handle Arabic letters.

Should this work if a font that can handle Arabic is used? Or do I need to do some kind of character set conversion if the text is Arabic?
