character code point

Kevin Tysen
Ranch Hand

Joined: Oct 12, 2005
Posts: 255
I was looking at the methods in the class Character. I need a simple method to change a char to an int in Unicode. But there is no such method. I just found a few methods, including one called toCodePoint(char high, char low), which deal with surrogate pairs. According to the API, toCodePoint "converts the specified surrogate pair to its supplementary code point value." Why can't you just change a char to a number? And what are surrogate pairs?
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

Originally posted by Kevin Tysen:
... Why can't you just change a char to a number? ...

Like this...?

char c = 'x';
int i = c; // implicit widening conversion; i is now 120 (U+0078)

The original Unicode specification used 16-bit values, and this is what Java's char type was based on. This allows for values between 0 and 65535 (that is, 0 and 2^16 - 1), which are also written as U+0000 and U+FFFF.

However, the Unicode specification has since been expanded to allow for values up to U+10FFFF, with values above the 16-bit limit of U+FFFF called "supplementary characters." In Java, supplementary characters are represented as a pair of char values. The first of these is called the "high surrogate" and the second is the "low surrogate."

For a char value that is not part of a surrogate pair, you can simply widen to type int with an assignment conversion (as shown above).
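For the supplementary-character case, here is a minimal sketch of how a surrogate pair maps to a single code point. The class name SurrogateDemo, the helper firstCodePoint, and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) as the sample character are only illustrative assumptions; the Character and String methods used have been in the standard API since Java 5.

```java
public class SurrogateDemo {
    // Hypothetical helper: returns the code point that starts at index 0,
    // combining a high/low surrogate pair into one value if one is present.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        // U+1D11E lies above U+FFFF, so it does not fit in a single char.
        char[] pair = Character.toChars(0x1D11E);
        char high = pair[0]; // high surrogate, 0xD834
        char low = pair[1];  // low surrogate, 0xDD1E

        System.out.println(Character.isHighSurrogate(high)); // true
        System.out.println(Character.isLowSurrogate(low));   // true

        // Combine the two chars back into the supplementary code point.
        int codePoint = Character.toCodePoint(high, low);
        System.out.println(Integer.toHexString(codePoint));  // 1d11e

        // A String counts chars, not code points.
        String s = new String(pair);
        System.out.println(s.length());                      // 2
        System.out.println(s.codePointCount(0, s.length())); // 1
        System.out.println(firstCodePoint(s) == codePoint);  // true
    }
}
```

If the text might contain supplementary characters, reading it with codePointAt rather than charAt avoids treating one half of a pair as a character on its own.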
[ August 06, 2007: Message edited by: marc weber ]

Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42594
There are a couple of blog posts that give a brief introduction to surrogate pairs and how to handle them: one by Tom White and one by John O'Conner.

Raghavan Muthu
Ranch Hand

Joined: Apr 20, 2006
Posts: 3355

Yeah, that's great.

Thanks Marc for the explanation and Ulf for the links.

Kevin Tysen
Ranch Hand

Joined: Oct 12, 2005
Posts: 255
Oh, I see. I didn't know you could just change a char to an int so directly. That is very simple. Now it makes sense. Thank you.
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

Originally posted by Kevin Tysen:
Oh, I see. I didn't know you could just change a char to an int so directly...

Underneath it all, a 16-bit char is really a numeric value that's represented as a character symbol.
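As a small illustration of that point, the sketch below treats a char as a number: it widens it to int, does arithmetic on it, and narrows it back. The class CharAsNumber and its helper methods are hypothetical names chosen for this example, not anything from the Java API.

```java
public class CharAsNumber {
    // Hypothetical helper: widening a char to int needs no cast.
    static int toCodeUnit(char c) {
        return c;
    }

    // Hypothetical helper: c + 1 is an int, so narrowing back needs a cast.
    static char nextChar(char c) {
        return (char) (c + 1);
    }

    // Hypothetical helper: formats a char in the usual U+XXXX notation.
    static String toUnicodeNotation(char c) {
        return String.format("U+%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(toCodeUnit('A'));             // 65
        System.out.println(nextChar('A'));               // B
        System.out.println(toUnicodeNotation('A'));      // U+0041
        System.out.println((int) Character.MAX_VALUE);   // 65535
    }
}
```

Going from char to int is implicit because every char value fits in an int; going the other way needs an explicit cast because most int values do not fit in a char.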