Problem with è character in Java.

 
Samir Banerjee
Ranch Hand
Posts: 72
Hi,
I have the string 'Anikèt' in a database. When I fetch it in Java, it comes back as 'Anik�t'. How can I get the actual value, or is there something else in Java that can give me 'e'?
Thanks in advance!!!
Aniket
 
Raghavan Muthu
Ranch Hand
Posts: 3381
Mac MySQL Database Tomcat Server
One thing you can do is use the appropriate locale to preserve the character set.

The other thing that comes to mind is the getBytes() method, which gives you the byte representation of the data actually stored.
 
Ulf Dittmer
Rancher
Pie
Posts: 42966
73
What is the "it" in "it returns"? Where are you seeing this? Note that consoles, terminals and simple-minded text editors may not be able to display accented characters properly.
 
Samir Banerjee
Ranch Hand
Posts: 72
Thanks Ulf for the reply.

Actually, I have a prepared statement in Java, e.g.:



So if the name is 'Anikèt', the value I get is 'Anik�t'.
I have also tried getBytes("UTF-8"), but it returns garbage.

Note: The name column in the DB is a varchar.

Let me know if you need more info.
 
Ulf Dittmer
Rancher
Pie
Posts: 42966
73
You're missing what I was getting at; where do you see the character? Is it a piece of software that supports accented characters?
 
Raghavan Muthu
Ranch Hand
Posts: 3381
Mac MySQL Database Tomcat Server
To add to Ulf's question: he is asking how and where you actually use the fetched data stored in the variable name.

Do you print it to the console with System.out.println, or write it to a file with a logger, etc.?
 
Samir Banerjee
Ranch Hand
Posts: 72
I write this value to an Excel file, and the wrong value is shown in Excel itself.
I use Apache POI to do that. But I can see that the String object I am writing already contains the wrong value when it is fetched from the DB.
 
Ulf Dittmer
Rancher
Pie
Posts: 42966
73
Samir Banerjee wrote:
I write this value to an Excel file, and the wrong value is shown in Excel itself.

This may or may not preserve accented characters correctly; I wouldn't count it as proof that something is wrong.

Samir Banerjee wrote:
But I can see that the String object I am writing already contains the wrong value when it is fetched from the DB.

I still don't know where you "see" that - have you examined the character code? If so, post the code you used to do that.
 
Samir Banerjee
Ranch Hand
Posts: 72
What I mean by "see" is that when I debug the code at:
String name = rs.getString(1);
and inspect the value of name, it shows 'è' as '�'.
I tried:
byte[] cn = name.getBytes();
String nm = new String(cn);
Now this converts '�' -> '?'. Obviously this is not correct.
I can see this value when I inspect the object, when I do a sysout, or when I write it to an Excel file.
I hope it's clear this time. :P
 
Ulf Dittmer
Rancher
Pie
Posts: 42966
73
Yes, things are clearer now - unfortunately, what has become clear is that you're making a couple of mistakes :-)

Samir Banerjee wrote:
or I do sysout

See my earlier comments about consoles and terminals not being able to handle accented characters. This counts for nothing until you've convinced us that wherever the sysout output goes (which you haven't told us) can display those characters.
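As a rough illustration of the console point, the sketch below sends a string through a PrintStream with an explicitly named encoding instead of the platform default. The class name ConsoleEncoding, the helper utf8Bytes and the hard-coded sample string are made up for this example; the string only stands in for whatever the database returns. Whether the terminal then renders the character correctly still depends on the terminal's own settings, which Java cannot control.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class ConsoleEncoding {
    // Prints s through a PrintStream that encodes as UTF-8 and returns the bytes written.
    static byte[] utf8Bytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            out.print(s);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // \u00e8 is the e-grave character; a stand-in for the value read from the DB.
        String name = "Anik\u00e8t";

        // Same idea applied to the real console: name the encoding explicitly
        // instead of relying on the platform default.
        PrintStream console = new PrintStream(System.out, true, "UTF-8");
        console.println(name);

        // Show the raw bytes; e-grave is encoded as the two bytes C3 A8 in UTF-8.
        StringBuilder hex = new StringBuilder();
        for (byte b : utf8Bytes(name)) {
            hex.append(String.format("%02X ", b));
        }
        console.println(hex.toString().trim());
    }
}
```

If the first line of output looks wrong but the byte dump contains C3 A8, the string itself is probably fine and the display is the culprit.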

Samir Banerjee wrote:
byte[] cn = name.getBytes();
String nm = new String(cn);

Never, ever, use "String.getBytes()" or "new String(byte[])", unless you know what the platform default encoding is, and that encodings do not matter - which is not the case if you're dealing with accented characters. You simply must specify the encoding by using "String.getBytes(String)" and "new String(byte[], String)". (I don't actually understand what the point of those two lines of code is - they're doing the inverse of one another. Or rather, they would, if encodings were handled correctly.)

The fastest way to check if things are OK is to print out the numerical character codes of "name".
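A minimal sketch of that check, assuming the value is already in a String: it prints each char as a hexadecimal code, with no getBytes call or other conversion in between. The class name CharCodes, the helper codes and the hard-coded sample are hypothetical; in the real program the string would come from rs.getString(1).

```java
class CharCodes {
    // Returns every char of s as a hexadecimal UTF-16 code unit, separated by spaces.
    static String codes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for: String name = rs.getString(1);
        // \u00e8 is the e-grave character, written as an escape so the
        // source file's own encoding cannot interfere.
        String name = "Anik\u00e8t";
        System.out.println(codes(name));
        // prints: U+0041 U+006E U+0069 U+006B U+00E8 U+0074
    }
}
```

If the output contains U+00E8, the string is intact and only the display is at fault. If it contains U+FFFD (the Unicode replacement character) or U+003F (a question mark), the character was most likely already lost before the string reached this code, for example while the database driver decoded the column.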
 
Samir Banerjee
Ranch Hand
Posts: 72
I have also tried:



But no luck. What else should I try?

 
Ulf Dittmer
Rancher
Pie
Posts: 42966
73
It seems that you're now trying random things in order to make this work. That's not the way to success with software development. For example, US-ASCII does not contain accented characters, so whatever you're doing with that can't possibly work. And as I said, the getBytes call and the String constructor call are completely unnecessary. What you should do is what I've advised twice by now: print the numerical character codes of all characters of the string in question (without any unnecessary transformations in between).
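To make the US-ASCII point concrete, here is a small sketch that encodes a string to bytes and decodes it again with the same charset. The class name AsciiRoundTrip and the helper roundTrip are invented for this example, and it assumes Java 7 or later for StandardCharsets. With US-ASCII the accented character cannot be represented and is replaced by '?'; with UTF-8 the string survives unchanged, which also shows why the round trip achieves nothing useful.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AsciiRoundTrip {
    // Encodes s to bytes in the given charset, then decodes those bytes with the same charset.
    static String roundTrip(String s, Charset charset) {
        byte[] bytes = s.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // \u00e8 is the e-grave character; a stand-in for the value read from the DB.
        String name = "Anik\u00e8t";

        // US-ASCII has no e-grave, so getBytes(Charset) substitutes '?'.
        String viaAscii = roundTrip(name, StandardCharsets.US_ASCII);
        System.out.println(viaAscii.equals("Anik?t")); // prints: true

        // UTF-8 can represent the character, so the round trip is lossless.
        String viaUtf8 = roundTrip(name, StandardCharsets.UTF_8);
        System.out.println(viaUtf8.equals(name)); // prints: true
    }
}
```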

I also strongly advise to beef up your knowledge of character encodings; start with The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
 
Samir Banerjee
Ranch Hand
Posts: 72
Thanks Ulf
 
Raghavan Muthu
Ranch Hand
Posts: 3381
Mac MySQL Database Tomcat Server
Ulf Dittmer wrote:
Never, ever, use "String.getBytes()" or "new String(byte[])", unless you know what the platform default encoding is, and that encodings do not matter - which is not the case if you're dealing with accented characters. You simply must specify the encoding by using "String.getBytes(String)" and "new String(byte[], String)". (I don't actually understand what the point of those two lines of code is - they're doing the inverse of one another. Or rather, they would, if encodings were handled correctly.)

The fastest way to check if things are OK is to print out the numerical character codes of "name".


Thank you Ulf for the info and tips!
 
Martijn Verburg
author
Bartender
Posts: 3275
5
Eclipse IDE Java Mac OS X
Ulf Dittmer wrote:I also strongly advise to beef up your knowledge of character encodings; start with The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).


That article is awesome.
 
Samir Banerjee
Ranch Hand
Posts: 72
Truly... it resolved my issue too. Thanks!
 