Encode/Decode a String containing Chinese etc. characters to/from unicode

anubha verma
Greenhorn

Joined: Aug 06, 2009
Posts: 1
Hi, I am trying to encode a multilingual String (English, Chinese, Japanese, Latin, etc.) as Unicode. The encoded string is used in two ways:
1) It is decoded by the user interface (a web UI) for display.
2) It is read directly by users, so it should stay human-readable when it contains only English and special characters.

So, given a string that mixes English, Japanese, Chinese, etc., we want the Japanese/Chinese characters to be encoded as hex values; for example, 私 should be encoded as %E7%A7%81.
However, other special characters such as ! ? , and space should preferably be left unencoded.

The characters can be encoded with java.net.URLEncoder, but it also replaces most special characters, including the space character, which makes the result painful to read.
Unfortunately, URLEncoder does not have any API to configure which characters are encoded. Any suggestions on how I can proceed, or which encoder I can use?

Thanks in advance!
Regards
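
One possible approach to the question above is a small hand-written encoder that percent-encodes only the UTF-8 bytes of non-ASCII characters and leaves ASCII (letters, digits, space, ! ? , and so on) untouched. The sketch below is an illustration, not a standard API: the class name SelectiveEncoder and its encode/decode methods are hypothetical. It assumes UTF-8 is the target byte encoding (which matches 私 becoming %E7%A7%81) and it also escapes the % sign itself as %25 so that decoding stays unambiguous.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: percent-encodes only non-ASCII characters (and '%').
final class SelectiveEncoder {

    // Leaves ASCII as-is, except '%', which must be escaped to keep decoding unambiguous.
    static String encode(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);          // handles surrogate pairs
            i += Character.charCount(cp);
            if (cp < 0x80 && cp != '%') {
                out.append((char) cp);
            } else {
                byte[] utf8 = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
        }
        return out.toString();
    }

    // Reverses encode(): %XX escapes become bytes, everything else is copied through.
    static String decode(String input) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < input.length(); ) {
            char c = input.charAt(i);
            if (c == '%') {
                if (i + 2 >= input.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(input.charAt(i + 1), 16);
                int lo = Character.digit(input.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad escape at index " + i);
                }
                bytes.write((hi << 4) | lo);
                i += 3;
            } else {
                int cp = input.codePointAt(i);
                byte[] raw = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                bytes.write(raw, 0, raw.length);
                i += Character.charCount(cp);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Hi! \u79C1";              // \u79C1 is the character 私
        String encoded = encode(original);
        System.out.println(encoded);                  // Hi! %E7%A7%81
        System.out.println(decode(encoded).equals(original)); // true
    }
}
```

Note that java.net.URLDecoder would also decode the %XX escapes, but it turns a literal + into a space, so a matching hand-written decoder is probably the safer pairing when + is left unencoded.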
dwarakanathan thiru
Ranch Hand

Joined: Oct 14, 2009
Posts: 49
Hi Anubha,
I am working on something similar.
I am trying to store some Chinese characters in a file and display them on a JSP page, but I have no clue how.
An alternative is to store the Unicode-encoded form in the file and decode it in the JSP. Again, no clue.
Any help?


Thanks,
Dwarak T
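
For the file case described above, one thing that may help is to name the charset explicitly on both the write and the read, so that the platform default encoding never gets involved. The sketch below is a minimal illustration assuming UTF-8 is acceptable for the file; the class name Utf8FileRoundTrip and its two helper methods are hypothetical. On the JSP side, the page would presumably also need to declare UTF-8, for example with a page directive such as <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes and reads text with an explicit UTF-8 charset.
final class Utf8FileRoundTrip {

    // Encodes the text as UTF-8 bytes instead of relying on the platform default.
    static void writeUtf8(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes the file content as UTF-8, mirroring writeUtf8.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("greeting", ".txt");
        try {
            String original = "Hello, \u79C1!";      // \u79C1 is the character 私
            writeUtf8(file, original);
            System.out.println(original.equals(readUtf8(file))); // true
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```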