character code point

Kevin Tysen
Ranch Hand

Joined: Oct 12, 2005
Posts: 255
I was looking at the methods in the class Character. I need a simple method to change a char to an int in Unicode. But there is no such method. I just found a few methods, including one called toCodePoint(char high, char low), which deal with surrogate pairs. According to the API, toCodePoint "converts the specified surrogate pair to its supplementary code point value." Why can't you just change a char to a number? And what are surrogate pairs?
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

Originally posted by Kevin Tysen:
... Why can't you just change a char to a number? ...

Like this...?

char c = 'x';
int i = c;

The original Unicode specification used 16-bit values, and this is what Java's char type was based on. This allows for values from 0 to 65535 (that is, 0 to 2^16 - 1), written in Unicode notation as U+0000 to U+FFFF.

However, the Unicode specification has since been expanded to allow for values up to U+10FFFF, with values above the 16-bit limit of U+FFFF called "supplementary characters." In Java, supplementary characters are represented as a pair of char values. The first of these is called the "high surrogate" and the second is the "low surrogate."
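To make the surrogate-pair idea concrete, here is a minimal sketch using methods from java.lang.Character and java.lang.String. The class name SurrogatePairDemo, the helper names encode and decode, and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) as the sample character are illustrative assumptions, not something from the original post.

```java
class SurrogatePairDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF: a supplementary character picked for illustration.
    static final int G_CLEF = 0x1D11E;

    // Returns the UTF-16 form of a code point: one char for values up to U+FFFF,
    // two chars (high surrogate, then low surrogate) for supplementary characters.
    static char[] encode(int codePoint) {
        return Character.toChars(codePoint);
    }

    // Recombines a high/low surrogate pair into the code point it represents.
    static int decode(char high, char low) {
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        char[] pair = encode(G_CLEF);
        System.out.println(pair.length);                          // 2
        System.out.println(Character.isHighSurrogate(pair[0]));   // true
        System.out.println(Character.isLowSurrogate(pair[1]));    // true
        System.out.printf("U+%X%n", decode(pair[0], pair[1]));    // U+1D11E

        // In a String, length() counts char values, not code points.
        String s = new String(pair);
        System.out.println(s.length());                           // 2
        System.out.println(s.codePointCount(0, s.length()));      // 1
        System.out.println(s.codePointAt(0) == G_CLEF);           // true
    }
}
```

If the text may contain supplementary characters, String.codePointAt is probably the safer way to read a value, since it returns the full code point when it lands on a high surrogate followed by a low surrogate.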

For a char value that is not part of a surrogate pair, you can simply widen to type int with an assignment conversion (as shown above).
[ August 06, 2007: Message edited by: marc weber ]

Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41123
    
There are a couple of blog posts that give a brief introduction to surrogate pairs and how to handle them: Tom White and John O'Conner


Raghavan Muthu
Ranch Hand

Joined: Apr 20, 2006
Posts: 3344

Yeah, that's great.

Thanks Marc for the explanation and Ulf for the links.


Kevin Tysen
Ranch Hand

Joined: Oct 12, 2005
Posts: 255
Oh, I see. I didn't know you could just change a char to an int so directly. That is very simple. Now it makes sense. Thank you.
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

Originally posted by Kevin Tysen:
Oh, I see. I didn't know you could just change a char to an int so directly...

Underneath it all, a 16-bit char is really a numeric value that's represented as a character symbol.
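A small sketch of what "a char is really a numeric value" means in practice. The class name CharAsNumberDemo and the helper names are hypothetical, chosen only for this example.

```java
class CharAsNumberDemo {
    // Widening: a char converts to int implicitly, with no cast needed.
    static int toInt(char c) {
        return c;
    }

    // Narrowing: going from int back to char needs an explicit cast,
    // and only the low 16 bits of the int are kept.
    static char toChar(int value) {
        return (char) value;
    }

    // Arithmetic on a char is done in int, so the result is cast back to char.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        System.out.println(toInt('x'));                  // 120
        System.out.println(toChar(65));                  // A
        System.out.println(next('a'));                   // b
        System.out.println((int) Character.MAX_VALUE);   // 65535
    }
}
```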
 