
How to convert a character into unicode

 
Ranch Hand
Posts: 305
I have an Arabic font installed on my system, and I want to convert a particular Arabic word into its Unicode values. For example, say I write "Salam" in Arabic and its Unicode equivalent is 0640 0641 0642 0641 0643, where 0640=S, 0641=a, 0642=l and 0643=m (these values are made up, not the real ones). Is there a class in Java that can give me these Unicode values, i.e. 0640 etc.? Or do I have to maintain a database for that language and look the values up?
Please help.
Regards,
Arun
 
Ranch Hand
Posts: 401
You can create a unicode literal using \uXXXX - for example (using your made-up unicode values):
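The snippet originally posted here is not available, so this is only a minimal sketch of what such a literal might look like. The class name `UnicodeLiteralSketch` is made up for illustration, and the escape values are the placeholder ones from the question, not the real code points for the word.

```java
public class UnicodeLiteralSketch {
    // Five backslash-u escapes, one per letter of "Salam", using the
    // made-up values from the question (S=0640, a=0641, l=0642, m=0643).
    static final String WORD = "\u0640\u0641\u0642\u0641\u0643";

    public static void main(String[] args) {
        // Each escape becomes exactly one char in the compiled String.
        System.out.println(WORD.length());
    }
}
```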

If you want to get the numeric value of some character, you can use the char type:
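The original snippet is missing here as well. As a rough sketch, assuming the text is already in a `String`, the individual `char` values can be pulled out like this (the class and method names are hypothetical):

```java
public class CharValueSketch {
    // Returns the UTF-16 code units of the text, one char per unit.
    static char[] codeUnits(String text) {
        return text.toCharArray();
    }

    public static void main(String[] args) {
        for (char c : codeUnits("Salam")) {
            // A char is an unsigned 16-bit value, so it can be compared
            // with numbers directly.
            System.out.println(c + " is below 128: " + (c < 128));
        }
    }
}
```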

The char can be cast to an int and then you can use it as any other number:
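Again the posted code did not survive, so here is one possible way to do it rather than the original. It casts each `char` to an `int` and formats it as four hex digits, which is the kind of output the question asks for. `CodePointHexSketch` and `toHex` are illustrative names, and the sample word uses the real code points for "Salam" written in Arabic script.

```java
public class CodePointHexSketch {
    // Formats each char of the text as a four-digit uppercase hex value,
    // separated by spaces.
    static String toHex(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            int value = (int) text.charAt(i); // the cast gives the numeric value
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", value));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: 0633 0644 0627 0645
        System.out.println(toHex("\u0633\u0644\u0627\u0645"));
    }
}
```

This works per `char`, which is fine for Arabic letters. Characters outside the Basic Multilingual Plane would show up as two surrogate values.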
 
"The Hood"
Posts: 8521
Do you want to return a String representation of the Unicode value?
The Java compiler converts typed characters into Unicode before converting them into bytecode. Almost everything in the JRE is built with the intent of converting incoming Unicode into its correct character.
You will probably need to load the conversion mapping into a Map or something similar to get the job done.
 
Wanderer
Posts: 18671
If you don't already know the Unicode values of the character you want to represent, Java does have the ability to convert from a variety of other encodings. You would need to know some way of representing "Salam" in a text file of some sort - preferably using one of these encodings. And you'd need to know which encoding had been used. Let's say you have a file saying "Salam" using Cp420, which is IBM Arabic. Then, to read the file contents and convert them to a Unicode String:
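The code that followed this paragraph is missing, so this is a sketch of the approach described rather than the original. It wraps an `InputStream` in an `InputStreamReader` with a named encoding. The class name `DecodeStreamSketch` and the helper `readAll` are assumptions for illustration. The demo in `main` uses UTF-8 on an in-memory stream so it runs anywhere; whether `"Cp420"` is available depends on the charsets shipped with your JRE.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class DecodeStreamSketch {
    // Reads every byte from the stream and decodes it with the named
    // encoding, returning an ordinary (Unicode) Java String.
    static String readAll(InputStream in, String charsetName) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charsetName)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            // Also covers an unsupported encoding name.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "\u0633\u0644\u0627\u0645".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(bytes), "UTF-8");
        System.out.println(text.length());
    }
}
```

For the file case described above, the call would look something like `readAll(new FileInputStream("salam.txt"), "Cp420")`, where `salam.txt` is a hypothetical file name.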

There are several classes that allow you to specify encoding conversions like this: InputStreamReader, OutputStreamWriter, String (in some constructors and in getBytes()), and Channels (newReader() and newWriter()). Which one you use depends on whether you are working with streams, byte arrays, or channels. Hope that helps.
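For the byte array case, a small sketch using the `String` constructor and `getBytes()` might look like this. The class and method names are made up, UTF-8 is used only because every JRE supports it, and the `Charset` overloads shown here need Java 6 or later (older versions take the encoding name as a `String`).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteArrayConversionSketch {
    // Decodes bytes in the given encoding into a Unicode String.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Encodes a String into bytes in the given encoding.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String word = "\u0633\u0644\u0627\u0645";
        byte[] utf8 = encode(word, StandardCharsets.UTF_8);
        // Each of these four Arabic letters takes two bytes in UTF-8.
        System.out.println(utf8.length);
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(word));
    }
}
```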
[ August 20, 2002: Message edited by: Jim Yingst ]
 
Ranch Hand
Posts: 33
This is similar to a problem I'm having at the moment. I don't know much about fonts and charsets, but right now my server program takes an email and extracts some text from it. The text is stored in a String and then exported to a PDF. It works for English. Yay. Now I need it to handle Arabic letters.

Should this work if a font that can handle Arabic is used? Or do I need to do some kind of character set conversion if the text is Arabic?

Thanks
 