Problem encoding japanese character

 
Ranch Hand
Posts: 31
Hi All,

I have found good solutions to my problems here, and I hope to get one for the problem I am facing now.

I have to store Japanese data in a PostgreSQL database. Mine is a web application, and I get the values using request.getParameterMap(). When a request contains Japanese characters, the value that is retrieved contains junk characters [like è£?ç½®ã?®ã??ã??ã?®è£?ç½®]. I am not sure how to convert this into proper Japanese characters to store in the database. I searched the net and tried a conversion using the code below, but I get output like this:

�?置�?��??�??�?��?置




Can someone help me with this? Thanks in advance.
 
Ranch Hand
Posts: 781
Netbeans IDE Ubuntu Java
Strings in Java are always, always, ALWAYS Unicode, encoded as UTF-16. There is no such thing as a UTF-8 String, and you should not need to do any conversion at all, so your method is pointless. Your JDBC driver should perform any character encoding required.
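To illustrate the point above: an encoding only comes into play when bytes are turned into a String or a String into bytes. The following is a minimal sketch, not the original poster's code (which is not shown in the thread). It assumes the junk came from UTF-8 bytes being decoded as ISO-8859-1 somewhere before the String was created; the class and method names are hypothetical, and the sample text is the word 装置 written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how mojibake arises at a byte boundary.
class MojibakeDemo {

    // Simulates the bug: UTF-8 bytes are decoded with the wrong charset.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the bug, which only works if no byte was lost on the way.
    // ISO-8859-1 maps every byte to exactly one char, so this round trip is lossless.
    static String repair(String junk) {
        byte[] original = junk.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String japanese = "\u88C5\u7F6E";   // two characters, three UTF-8 bytes each
        String junk = misdecode(japanese);

        System.out.println(japanese.length());               // 2 UTF-16 code units
        System.out.println(junk.length());                   // 6: one char per misread byte
        System.out.println(repair(junk).equals(japanese));   // true
    }
}
```

A repair like this is a diagnostic aid rather than a fix. If the junk already contains literal `?` characters, as in the sample from the first post, some bytes have been replaced and cannot be recovered, so the wrong decoding step itself has to be found and corrected.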
 
Ranch Hand
Posts: 1051
Eclipse IDE Firefox Browser
We can use SQL to do this, but if there are not too many characters we can use a Java HashMap instead.

E.g.


This is effectively what a database would do anyway.
 
James Sabre
Ranch Hand
Posts: 781
Netbeans IDE Ubuntu Java
shanky sohar wrote:We can use SQL to do this, but if there are not too many characters we can use a Java HashMap instead.



I fail to see the relevance of this to the original problem. The original problem is the false assumption that a String object can be encoded as UTF-8, when it is always Unicode encoded as UTF-16!
 
Hema Nandhini
Ranch Hand
Posts: 31
Thanks for your replies...

Now, my database [PostgreSQL] is created with Encoding "UTF8", Collation "POSIX" and Ctype "POSIX". Without doing any conversion myself, the data is inserted as is, i.e., the junk data è£?ç½®ã?®ã??ã??ã?®è£?ç½® is inserted. I also tried running "SET CLIENT_ENCODING TO UTF8" before the insert, but there is still no improvement.

Is there something that I am missing here?
 
James Sabre
Ranch Hand
Posts: 781
Netbeans IDE Ubuntu Java
One possibility: does the software being used to view the database know how to display UTF-8 encoded data, and does it have a font with glyphs for Japanese characters? Can you view the table contents as hex-encoded bytes and check whether the content corresponds to UTF-8 characters?
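For the hex check suggested above, it helps to know which bytes to expect. The sketch below (a hypothetical helper, not something from the thread) prints the UTF-8 bytes of a String in hex on the Java side. On the database side, something like PostgreSQL's `encode(convert_to(some_column, 'UTF8'), 'hex')` should give a comparable dump, where `some_column` is a placeholder for the real column name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: dumps the UTF-8 bytes of a String as space-separated hex.
class Utf8HexDump {

    static String toHex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask to 0..255 so negative bytes do not print as ffffffxx.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // The word 装置, written with Unicode escapes.
        System.out.println(toHex("\u88C5\u7F6E"));   // e8 a3 85 e7 bd ae
    }
}
```

If the stored value were correct UTF-8, the dump for these two characters would be `e8 a3 85 e7 bd ae`. A dump that begins `c3 a8 c2 a3` instead would suggest the text was already mis-decoded as a single-byte charset before it reached the database, so the viewer and its font would not be the culprit.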
 
Bartender
Posts: 10336
Hibernate Eclipse IDE Java
This is always good background reading for people struggling with this sort of issue. Not a direct answer to your issue but still very much worth a read.
 
Hema Nandhini
Ranch Hand
Posts: 31
Thanks James and Paul, the link was really helpful. "Struggling" was exactly the word for me that day. Now I am clear on the encoding concepts, and I was able to get it right for POST requests: storage and retrieval both work well there. However, I have not been able to make it work for GET. request.getParameterMap() returns junk values when the input is encoded in JavaScript with encodeURIComponent. I am still trying, and will post the solution if I find one.
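The thread ends without a posted solution, so the following is only a sketch of one possible workaround. It assumes the servlet container decodes the query string with a single-byte default charset, which the thread does not confirm. Calling `request.setCharacterEncoding("UTF-8")` before reading parameters is the usual step for POST bodies, but in many containers it does not apply to the query string. In that case the raw query from `request.getQueryString()` can be decoded by hand. The class below is hypothetical and takes the raw query as a plain String so that it runs without the servlet API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decodes a raw query string as UTF-8, bypassing the
// container's own parameter decoding. Keeps only the last value per name.
class QueryStringDecoder {

    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;   // tolerate stray '&' separators
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            // Percent-escapes are interpreted as UTF-8 bytes, matching
            // what JavaScript's encodeURIComponent produces.
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required on every JVM", e);
        }
    }

    public static void main(String[] args) {
        // encodeURIComponent applied to the word 装置 yields %E8%A3%85%E7%BD%AE
        Map<String, String> params = parse("name=%E8%A3%85%E7%BD%AE&id=7");
        System.out.println(params.get("name").equals("\u88C5\u7F6E"));   // true
    }
}
```

Where the container allows it, configuring the container's own URI decoding charset is likely the cleaner fix, because `request.getParameterMap()` then returns correct values without extra code.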
 