GeeCON Prague 2014*
Problems with non-english characters

Unnar Björnsson
Ranch Hand

Joined: Apr 30, 2005
Posts: 164
I'm trying to get my application to print the contents of a folder using cmd.exe. It works, with one exception: I'm Icelandic, and some of my folder names contain Icelandic characters like 'á' and 'ý'.
The application initializes the String currentDir with this method:


currentDir holds the path that should be applied after the command "dir" in cmd.exe, so when I type "ls" (which is my shell's command for "dir") it shows the contents of the folder located at currentDir.
In my case currentDir is: C:\Documents and Settings\Torquemada\My Documents\Javaskrár\Unnar\Verkefni 1 stýrikerfi, which includes 2 Icelandic characters, 'á' and 'ý', so when I execute "ls" nothing happens.
I made a new string: testString = "C:\\Documents and Settings\\Torquemada\\My Documents\\Javaskrár\\Unnar\\Verkefni 1 stýrikerfi" and executed "ls" with testString as the argument instead of currentDir, and everything was fine. I even printed both strings with System.out.println() and compared them, and got this:

C:\Documents and Settings\Torquemada\My Documents\Javaskr�r\Unnar\Verkefni 1 sty
rikerfi
C:\Documents and Settings\Torquemada\My Documents\Javaskr�r\Unnar\Verkefni 1 st�
rikerfi
Equal? - false

The lower string is the one that works, but the upper one, which displays the string correctly, doesn't.

How do I fix it?
[ February 04, 2006: Message edited by: Unnar Björnsson ]
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    
Oh, I see. It took me a while to understand. You are executing the "cd" command and reading its output so you can get the current working directory. When you do that, you use something that doesn't use the system's default charset as Java understands it, and then you convert its output using the system's default charset. So any non-ASCII characters are converted using one scheme and then converted back using a different scheme. Hence the errors.
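To make that concrete, here is a minimal sketch of the same kind of mismatch: text is encoded to bytes with one charset and decoded with another. The class and method names are made up for illustration, and UTF-8 and ISO-8859-1 are only stand-ins chosen because every JVM supports them; the code pages actually involved on the poster's machine are not known from this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the original post.
class CharsetMismatchDemo {

    // Encode the text with one charset, then decode the resulting bytes with another.
    static String recode(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        // \u00fd is the letter 'ý'; the escape avoids source-file encoding issues.
        String original = "st\u00fdrikerfi";

        // Same charset on both sides: the text survives the round trip.
        String same = recode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Different charsets (stand-ins, see above): the non-ASCII letter is mangled.
        String mixed = recode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("Same charset equal?  - " + original.equals(same));
        System.out.println("Mixed charset equal? - " + original.equals(mixed));
    }
}
```

The ASCII letters come through unchanged in both cases, which is why only the paths containing Icelandic characters break.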

There's probably a way to fix that, but it would be easier to use a less horribly convoluted way of getting the current working directory. Like one of the ways built into Java itself.

And if you were going to continue with more Runtime.exec() calls to "dir" or "ls" to find the files in that directory, please look up the methods in the java.io.File class that let you get the files in a directory from inside Java, without having to use OS-specific hacks like that.
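As a rough sketch of that advice (not necessarily the code originally posted here), the "user.dir" system property gives the working directory and java.io.File can list a folder, all without starting cmd.exe. The class and method names below are hypothetical.

```java
import java.io.File;

// Hypothetical helper class for illustration only.
class WorkingDirListing {

    // The current working directory as the JVM sees it; no external process involved.
    static String currentDir() {
        return System.getProperty("user.dir");
    }

    // Names of the entries in a directory, or an empty array if it cannot be listed.
    static String[] list(String path) {
        String[] names = new File(path).list();
        return names != null ? names : new String[0];
    }

    public static void main(String[] args) {
        String dir = currentDir();
        System.out.println(dir);
        for (String name : list(dir)) {
            System.out.println(name);
        }
    }
}
```

Because the path never leaves Java as console bytes, there is no charset conversion step in which the Icelandic characters could be damaged.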
Unnar Björnsson
Ranch Hand

Joined: Apr 30, 2005
Posts: 164
That looks more promising, thanks!
 