
Encode/Decode a String containing Chinese etc. characters to/from Unicode

 
anubha verma
Greenhorn
Posts: 1
Hi, I am working on encoding a multilingual String (English, Chinese, Japanese, Latin, etc.) as Unicode. The encoded string is used in two ways:
1) It is decoded by the user interface (a web UI) for display.
2) It is read directly by users, so it should be human-readable when it contains only English and special characters.

So, given a string that mixes letters from English, Japanese, Chinese, etc., we want the Japanese/Chinese characters to be encoded as hex values; for example, 私 should be encoded as %E7%A7%81 (its UTF-8 bytes, percent-encoded).
However, it is preferable that special characters such as ! ? , and space are not encoded and are left as they are.

The encoding itself is achievable with java.net.URLEncoder, but it also encodes most special characters and turns the space character into +, which makes the result painful to read.
Unfortunately, URLEncoder does not have any API to configure which characters to encode. Any suggestions on how I can proceed, or which encoder I can use?

Thanks in advance!
Regards
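
One possible approach to the requirement above, offered as a minimal sketch rather than a definitive answer: walk the string by code point, leave every ASCII character as it is, and percent-encode only the UTF-8 bytes of non-ASCII characters. The class name SelectiveEncoder and its two methods are hypothetical, not part of any library. The sketch also encodes the % character itself (as %25), an assumption added here so that decoding stays unambiguous.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: percent-encodes only non-ASCII characters (and '%').
class SelectiveEncoder {

    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp < 0x80 && cp != '%') {
                // Plain ASCII (letters, digits, space, ! ? , ...) stays readable.
                out.append((char) cp);
            } else {
                // Everything else becomes %XX for each of its UTF-8 bytes.
                byte[] utf8 = input.substring(i, i + len).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += len;
        }
        return out.toString();
    }

    static String decode(String encoded) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            int hi = -1;
            int lo = -1;
            if (c == '%' && i + 2 < encoded.length()) {
                hi = Character.digit(encoded.charAt(i + 1), 16);
                lo = Character.digit(encoded.charAt(i + 2), 16);
            }
            if (hi >= 0 && lo >= 0) {
                // A valid %XX escape contributes one raw byte.
                buf.write((hi << 4) | lo);
                i += 3;
            } else {
                // Any other character is copied through as its own UTF-8 bytes.
                int cp = encoded.codePointAt(i);
                byte[] raw = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                buf.write(raw, 0, raw.length);
                i += Character.charCount(cp);
            }
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this sketch, SelectiveEncoder.encode("Hello, 私!") yields "Hello, %E7%A7%81!", and decode reverses it. On the web UI side, JavaScript's decodeURIComponent should be able to decode the same format, since the escapes are plain percent-encoded UTF-8 and spaces are left untouched.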
 
dwarakanathan thiru
Ranch Hand
Posts: 49
Hi Anubha,
I am working on something similar too.
I am trying to store some Chinese characters in a file and display them on a JSP page, but I have no clue how.
An alternative is to store the Unicode-escaped form in the file and decode it in the JSP, but again, I have no clue how.
Any help?
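
For the file case, one option that may avoid escaping altogether is to write and read the file with an explicit UTF-8 charset instead of the platform default. The following is a minimal sketch under that assumption; the class name Utf8FileDemo and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: stores and loads text using an explicit UTF-8 charset.
class Utf8FileDemo {

    // Write the text as UTF-8 bytes, replacing any existing file content.
    static void save(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    // Read the whole file back and decode it as UTF-8.
    static String load(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}
```

For the characters to display correctly, the JSP would also need to declare UTF-8 for its response, for example with the directive <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>.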
 