
convert given unicode literal

 
ilkin esrefli
Greenhorn
Posts: 26
Chrome Java Spring
Hello ranchers, I need your help. I need to convert a given Unicode literal value to the corresponding Unicode character. How can I do it? For example:




Please help me write the convert() method.

Thanks,
 
John Jai
Rancher
Posts: 1776
How do you determine that \\u0259 corresponds to a particular Unicode value?

If there is no such mapping in the API (I don't know whether one exists), you can keep all the mappings in a properties file, read it in your application, parse the String you want to check, and replace each escape with its configured value.
 
Jeff Verdegan
Bartender
Posts: 6109
Android IntelliJ IDE Java
Run this program. Make sure you understand why you get the results you do.


 
ilkin esrefli
Greenhorn
Posts: 26
Chrome Java Spring
So I need to know how the application below converts it:
http://rishida.net/tools/conversion/?q=bel\\u0259
 
John Jai
Rancher
Posts: 1776
Did you try Jeff's code?
 
ilkin esrefli
Greenhorn
Posts: 26
Chrome Java Spring
I know the difference between \\u0259 and \u0259. Maybe I didn't explain it well. Let me try to explain it like this:

In the example below, the input value is bel\u0259

run as

java Java2Uni bel\u0259
result: belə


 
Jeff Verdegan
Bartender
Posts: 6109
Android IntelliJ IDE Java
ilkin esrefli wrote:I know differences between \\u0259 and \u0259.


Your first post makes it pretty clear that you don't.

ilkin esrefli wrote:Maybe I couldn't explain. Try to explain like that:

below example input value is bel\u0259

run as

java Java2Uni bel\u0259
result: belə




Ah, so you don't want to provide Unicode escapes in your source code. You want to provide them at runtime and parse them yourself. Note that the escape sequence \u has no special meaning to java.lang.String; it matters only to the compiler when it reads source code, such as string literals.

So if you want to provide the characters '\', 'u', '0', '2', '5', '9' in a java.lang.String and convert that to a java.lang.String containing just the character 'ə', then I think you have to do it yourself. I don't think there's any built-in facility to do that. You would need to recognize the '\' and 'u' characters, then take the next 4 characters and get their int value, such as with Integer.parseInt("0259", 16), and then do something like cast the result to char, although I'm not sure if that would work in the general case, or if you'd need to use http://docs.oracle.com/javase/6/docs/api/java/lang/Character.html#toChars(int)
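The steps described above could be sketched roughly as follows. This is only a minimal illustration, not the code from the thread: the class name UnicodeUnescaper and the helper isEscapeAt are made up here, and the sketch assumes every escape is a backslash, a lowercase 'u', and exactly four hex digits. Because four hex digits can never exceed 0xFFFF, a plain cast to char seems to be enough in this form; Character.toChars(int) would only matter for code points above that, which this notation writes as two escapes (a surrogate pair) anyway.

```java
class UnicodeUnescaper {

    // Hypothetical helper: true if a backslash, a 'u' and four hex digits
    // start at index i of s.
    static boolean isEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Replaces each well-formed escape with the single char it names.
    // Anything that does not look like a complete escape is copied unchanged.
    static String convert(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (isEscapeAt(input, i)) {
                // Parse the four hex digits, e.g. "0259" gives 0x0259.
                int codeUnit = Integer.parseInt(input.substring(i + 2, i + 6), 16);
                out.append((char) codeUnit);
                i += 6;
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Uses the first command-line argument if present; otherwise a sample
        // made of the nine characters b, e, l, backslash, u, 0, 2, 5, 9.
        String input = args.length > 0 ? args[0] : "bel\\u0259";
        System.out.println(convert(input));
    }
}
```

Whether the converted character prints correctly depends on the console encoding, and some shells strip the backslash from a command-line argument unless it is quoted.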
 
Claudiu Chelemen
Ranch Hand
Posts: 75
Eclipse IDE Java Oracle
You need to unescape your string.
I mentioned this class in a post earlier this week, could be useful in your case as well:

StringEscapeUtils

Cheers,
Claudiu
 
Harsha Smith
Ranch Hand
Posts: 287
 
Jeff Verdegan
Bartender
Posts: 6109
Android IntelliJ IDE Java
Harsha Smith wrote:


I think you're missing the point. "\u1234" as a String literal is not the same as a String created from characters '\', 'u', '1', '2', '3', '4', such as when reading "\u1234" from a text file. The first is interpreted by the compiler into a String with a single char in it. There's no parsing done at runtime.

The second case results in a String with the 6 characters '\', 'u', '1', '2', '3', '4' in it, which then needs to be further parsed at runtime if we want to end up with that same single-character String. This second case seems to be what the OP is asking about.
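A small sketch may make the two cases concrete. The class and method names here are invented for illustration; the second String is built from individual chars to stand in for text that arrives at runtime, for example from a file or a command-line argument.

```java
class LiteralVersusRuntime {

    // Case 1: the escape appears in source code, so the compiler has already
    // turned it into a single char before the program runs.
    static String fromLiteral() {
        return "\u1234";
    }

    // Case 2: the same six characters supplied one by one, as they would be
    // if read from a text file. No translation happens here.
    static String fromChars() {
        return new String(new char[] {'\\', 'u', '1', '2', '3', '4'});
    }

    public static void main(String[] args) {
        System.out.println(fromLiteral().length());            // 1
        System.out.println(fromChars().length());              // 6
        System.out.println(fromLiteral().equals(fromChars())); // false
    }
}
```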
 