JavaRanch » Java Forums » Java » Swing / AWT / SWT
international characters

Mortimer Mousse
Greenhorn

Joined: Apr 13, 2005
Posts: 6
First of all, I'd like to apologize, because I know that my question probably has a very simple answer and that it has been answered somewhere in this forum. I have found a few posts that deal with it, but somehow the solution still eludes me.

I have a simple Java application that I'm developing with Eclipse and SWT. One of the features of the application is reading data from a text file and displaying it on a label.

The data is not in English and it uses some non-ASCII characters, for instance "čekić". The application reads this word and writes out "?eki?".
Even if I put "label.setText("čekić")" in the code itself, it still displays those darn question marks.

I have tried many things but I simply cannot make it write these characters properly. Can someone PLEASE tell me, like I'm a 5-year-old, what exactly I have to do to display these characters correctly? Please help me, people, I'm going nuts here.

Thanks in advance. Cheers!
Avi Abrami
Ranch Hand

Joined: Oct 11, 2000
Posts: 1134

Mort,
I don't have an answer for you -- sorry.
But in case you haven't read it, the following blog may be of help:

http://weblogs.java.net/blog/tomwhite/archive/2005/03/counting_charac.html

Good Luck,
Avi.

P.S. Just out of curiosity, would you mind telling me what language that word is in -- and how you pronounce it? (Thanks)
[ April 13, 2005: Message edited by: Avi Abrami ]
Lionel Badiou
Ranch Hand

Joined: Jan 06, 2005
Posts: 140
Hello,

Did you check the font you use? Some "exotic" languages require special fonts to be displayed correctly.

Hope that helps,


Lionel Badiou
CodeFutures Software
Mortimer Mousse
Greenhorn

Joined: Apr 13, 2005
Posts: 6
The word "čekić" is a Croatian word meaning "hammer". The first letter is pronounced "ch" as in "change". The last sounds similar but is spoken more softly, like the first syllable of the spoken Italian word "ciao". The closest pronunciation would be "checkeech" (but not "check-eech"; you must say those two parts together with the first part slightly accentuated, like the words "baby" or "happy"). There, today's lesson in the Croatian language is over.

As for the solution, I still don't have one and am still asking for help. I have found an interim solution, which is breaking every word into characters and inserting a Unicode escape in place of each special character. It works, but it is awfully inelegant and cumbersome.
[ April 14, 2005: Message edited by: Mortimer Mousse ]
Raja Kannappan
Ranch Hand

Joined: May 08, 2002
Posts: 83
The data in the text file must be in a Unicode encoding, and you must read the string from the file as Unicode. Then you can display the string in the label.

If your data is not in a Unicode encoding, you can try converting it to Unicode, or maybe you can find some conversion routines online.

Hope it helps,

- Raja.


SCJP
Jared Cope
Ranch Hand

Joined: Aug 18, 2004
Posts: 243
Hi,

First of all, a couple of hints.

1. When you see a '?' instead of the character you expect, it means that the encoding used to read the bytes could not figure out what the bytes should be mapped to. In short, '?' means an encoding problem.

2. If you see a square box instead of the character you expect, it usually means that the font you are using does not have a glyph for the character you need to display.
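
To make the first hint concrete, here is a minimal sketch of one way a '?' can arise: the text is pushed through a charset that has no mapping for č and ć. The class and method names are made up for illustration, ISO-8859-1 is just one example of a charset lacking those letters, and `StandardCharsets` assumes Java 7 or later. The word is written with Unicode escapes so the encoding of the source file itself cannot interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Encodes the text to bytes with the given charset and decodes it again.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String hammer = "\u010deki\u0107"; // "čekić"

        // ISO-8859-1 has no č or ć, so both are replaced with '?'.
        System.out.println(roundTrip(hammer, StandardCharsets.ISO_8859_1)); // prints ?eki?

        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(hammer, StandardCharsets.UTF_8).equals(hammer)); // prints true
    }
}
```

If the same kind of mismatch happens anywhere between the file, the source code and the label, the symptom looks like the "?eki?" described in the original question.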

As for solutions, I suggest the following:

1. Try setting the label statically with Unicode escapes at compile time to see if you can get it to look right. For example:
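
The code that originally followed here did not survive in this copy of the thread. As a stand-in, here is a small sketch of what such a test could look like; the class and constant names are hypothetical. It is kept free of SWT so it can run on its own, and the comment shows where the `label.setText` call from the question would go.

```java
public class EscapedLabelText {

    // "čekić" spelled with Unicode escapes: \u010d is č and \u0107 is ć.
    // Escapes are plain ASCII, so the encoding of the .java file cannot corrupt them.
    static final String HAMMER = "\u010deki\u0107";

    public static void main(String[] args) {
        // In the SWT application this would be: label.setText(HAMMER);
        // Here we print the code points instead, which avoids any console encoding issue.
        for (char c : HAMMER.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

If the label shows the word correctly with escapes but not with the literal characters typed into the source, the source file encoding is the likely suspect rather than SWT or the font.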


2. When reading in the characters from the file, make sure that you read the bytes from the file using the correct encoding. For example:
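
The original snippet is missing from this copy of the thread as well, so the following is only a sketch of the idea. The class name, method name and command-line arguments are hypothetical, and the encoding you pass in is an assumption you have to verify: a Croatian text file might just as well be windows-1250 or ISO-8859-2 as UTF-8. The try-with-resources form assumes Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class EncodedFileReader {

    // Reads the first line of a file, decoding its bytes with the named charset
    // rather than relying on the default charset.
    static String readFirstLine(String path, String encoding) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), encoding))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java EncodedFileReader <file> <encoding>");
            return;
        }
        String line = readFirstLine(args[0], args[1]);
        // In the SWT application this would be: label.setText(line);
        System.out.println(line);
    }
}
```

The key point is the second argument to `InputStreamReader`; a plain `FileReader` constructed from just a file name falls back to the default charset.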


This means that you need to know the encoding of the file. This could be UTF-8, ISO-* or any of the others supported by Java and described at: http://java.sun.com/j2se/1.4.2/docs/guide/intl/encoding.doc.html

Working with international text is not my favourite part of Java programming. But it does get easier once you get over a few hurdles.

Best of luck,

Cheers, Jared.


SCJP 1.4 91%, SCJP 1.5 88%, SCJD B&S
 