
character code point

 
Kevin Tysen
Ranch Hand
Posts: 255
I was looking at the methods in the class Character. I need a simple method to change a char to an int in Unicode. But there is no such method. I just found a few methods, including one called toCodePoint(char high, char low), which deal with surrogate pairs. According to the API, toCodePoint "converts the specified surrogate pair to its supplementary code point value." Why can't you just change a char to a number? And what are surrogate pairs?
 
marc weber
Sheriff
Posts: 11343
Java Mac Safari
Originally posted by Kevin Tysen:
... Why can't you just change a char to a number? ...

Like this...?

char c = 'x';
int i = c;

The original Unicode specification used 16-bit values, and this is what Java's char type was based on. A char can therefore hold values from 0 to 65535 (that is, 0 to 2^16 - 1), a range also written as U+0000 to U+FFFF.

However, the Unicode specification has since been expanded to allow for values up to U+10FFFF, with values above the 16-bit limit of U+FFFF called "supplementary characters." In Java, supplementary characters are represented as a pair of char values. The first of these is called the "high surrogate" and the second is the "low surrogate."
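
As a rough sketch of what that looks like in code (the class name SurrogatePairDemo and the helper split are made up for illustration, and U+1D11E, the musical G clef, is just one convenient supplementary character), Character.toChars breaks a code point into its char values and Character.toCodePoint puts a pair back together:

```java
class SurrogatePairDemo {

    // Hypothetical helper: returns the UTF-16 char values for a code point.
    // The array has one element for U+0000..U+FFFF and two above that.
    static char[] split(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        int codePoint = 0x1D11E; // above U+FFFF, so it needs a surrogate pair

        char[] units = split(codePoint);
        char high = units[0];
        char low = units[1];

        // Cast to int so the values are formatted as numbers, not symbols.
        System.out.printf("high=U+%04X low=U+%04X%n", (int) high, (int) low);
        // prints: high=U+D834 low=U+DD1E

        System.out.println(Character.isHighSurrogate(high)); // true
        System.out.println(Character.isLowSurrogate(low));   // true

        // Combine the pair back into the original code point.
        int rebuilt = Character.toCodePoint(high, low);
        System.out.printf("U+%X%n", rebuilt); // prints: U+1D11E
    }
}
```

This is why toCodePoint takes two char arguments: neither half of the pair is a complete character on its own.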

For a char value that is not part of a surrogate pair, you can simply widen to type int with an assignment conversion (as shown above).
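
If you don't know in advance whether the text contains supplementary characters, one possible approach is to read code points from the String rather than individual chars. The following is only a sketch: the class CodePointDemo and the helper codePointsOf are invented names, while String.codePointAt, String.codePointCount, and Character.charCount are standard methods (available since Java 5).

```java
class CodePointDemo {

    // Hypothetical helper: returns every code point in s as an int,
    // treating a surrogate pair as a single value.
    static int[] codePointsOf(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result[n++] = cp;
            // Advance by 1 char for U+0000..U+FFFF, by 2 for a surrogate pair.
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        // 'x' followed by U+1D11E, which occupies two char values.
        String text = "x" + new String(Character.toChars(0x1D11E));

        System.out.println(text.length()); // 3 chars, but only 2 code points

        for (int cp : codePointsOf(text)) {
            System.out.printf("U+%04X%n", cp);
        }
        // prints:
        // U+0078
        // U+1D11E
    }
}
```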
[ August 06, 2007: Message edited by: marc weber ]
 
Ulf Dittmer
Rancher
Posts: 42967
73
There are a couple of blog posts that give a brief introduction to surrogate pairs and how to handle them: Tom White and John O'Conner
 
Raghavan Muthu
Ranch Hand
Posts: 3381
Mac MySQL Database Tomcat Server
Yeah, that's great.

Thanks, Marc, for the explanation, and Ulf for the links.
 
Kevin Tysen
Ranch Hand
Posts: 255
Oh, I see. I didn't know you could just change a char to an int so directly. That is very simple. Now it makes sense. Thank you.
 
marc weber
Sheriff
Posts: 11343
Java Mac Safari
Originally posted by Kevin Tysen:
Oh, I see. I didn't know you could just change a char to an int so directly...

Underneath it all, a 16-bit char is really a numeric value that's represented as a character symbol.
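
A small sketch of what that means in practice (CharAsNumberDemo and next are names made up for this example): a char widens to int without a cast, takes part in arithmetic like any other integral type, and needs an explicit cast to go back from int to char.

```java
class CharAsNumberDemo {

    // Hypothetical helper: returns the char whose numeric value is one higher.
    static char next(char c) {
        // c + 1 is an int expression, so a cast is needed to narrow it to char.
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        char c = 'x';
        int i = c;                        // widening conversion, no cast needed
        System.out.println(i);            // prints: 120

        char back = (char) 120;           // narrowing conversion, cast required
        System.out.println(back);         // prints: x

        System.out.println('a' + 1);      // int arithmetic, prints: 98
        System.out.println(next('a'));    // prints: b
    }
}
```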
 