JavaRanch » Java Forums » Java » Beginning Java
Chinese Filename

Keith Wong
Greenhorn

Joined: Jun 03, 2008
Posts: 7
Hello everyone,
I created a file named 你好.txt. I run into trouble when I run a Java program in DOS that reads the filename using file.list() and toString(): it returns ??.txt. Here is the source:

I have tried getBytes(Charset charset) with different charsets (UT-8, UTC-16 and Big5), but no luck.

The filename shows up in Chinese in Windows Explorer, and it is also displayed correctly in DOS when I execute the "dir" command.

Thanks for any help.

Keith
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42595
    
Java strings are in Unicode - is the console capable of displaying Unicode? I'm pretty certain that Windows Explorer uses some other character encoding internally, so it doesn't mean much that it can display those characters.


Keith Wong
Greenhorn

Joined: Jun 03, 2008
Posts: 7
Of Cause, DOS able to display. As I stated, the dir command shows 你好.txt.

Thanks,
Keith
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3018
    
[Ulf]: Java strings are in Unicode - is the console capable of displaying Unicode?

Internally, a String is stored in Unicode. However, when it's written to a console, it's generally converted to the platform's default encoding. Some encodings cannot represent some characters; e.g. it's impossible to represent almost all Asian characters in ISO-8859-1. However, on a machine configured for a Chinese locale, it's quite possible that Chinese characters are supported by the default encoding - and that appears to be the case here.
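
To make the encoding point concrete, here is a minimal sketch (not part of the original thread; the class name EncodingDemo and its helper methods are made up for illustration). It checks whether a charset can represent the filename, and shows that encoding to ISO-8859-1 silently replaces each unmappable character with "?", which is one way a name can end up as ??.txt. Whether this is exactly what happens on the machine in question is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // True if every character of text can be represented in the given charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encodes text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent come back as its replacement ('?').
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Unicode escapes for the two Chinese characters, so the source file's
        // own encoding does not matter.
        String name = "\u4f60\u597d.txt";

        System.out.println("ISO-8859-1 can encode: "
                + canEncode(name, StandardCharsets.ISO_8859_1)); // false
        System.out.println("UTF-8 can encode:      "
                + canEncode(name, StandardCharsets.UTF_8));      // true

        // The two Chinese characters are lost in the ISO-8859-1 round trip.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1)); // ??.txt
    }
}
```

Running the same check against Charset.defaultCharset() should tell you whether the platform's default encoding can represent the name at all.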

[Keith]: Of Cause, DOS able to display

Did you mean "of course"? It's not at all obvious to those of us in Western countries - our environments are usually not set up to be able to display such characters.

[Keith]: I have tried getBytes(Charset charset) with different charsets (UT-8, UTC-16 and Big5).

I'm not familiar with UT-8 or UTC-16. I don't think your computer is either. Have you tried UTF-8 or UTF-16? Also, have you tried using getBytes() with no arguments? This should use your system's default encoding. Still, I think that's what your original program shown above should do too, so I don't know why it doesn't work if your console is indeed able to display these characters.
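
A quick way to check which charset names the JVM actually recognizes is sketched below (the class name CharsetNames and the helper isKnown are hypothetical, chosen only for this example). It uses Charset.isSupported, and also prints the default charset that a no-argument getBytes() would use.

```java
import java.nio.charset.Charset;

public class CharsetNames {
    // True only if name is a legal charset name that this JVM supports.
    static boolean isKnown(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal names
            // (IllegalCharsetNameException is a subclass).
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {"UT-8", "UTC-16", "UTF-8", "UTF-16", "Big5"};
        for (String name : names) {
            System.out.println(name + " supported: " + isKnown(name));
        }
        // The charset that String.getBytes() with no arguments falls back to.
        System.out.println("default charset: " + Charset.defaultCharset());
    }
}
```

UTF-8 and UTF-16 are guaranteed to be present on every Java platform; Big5 depends on the installed JRE, so its result may vary from machine to machine.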

You might try looking at all your environment variables and system properties. One of these may be the name of the encoding you need to use here. I don't have a Windows box available right now to see what environment variables are used there. And in any event my Windows system wouldn't be able to display Chinese characters in the command prompt. (My Mac can, sure - but that's no use to you.) You'll have to look at the environment variables yourself to see if you can find one that looks significant. Good luck...
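
For the system-properties side of that search, a small sketch along these lines may help (the class name ShowEncodingSettings and the helper prop are invented for this example). Note that sun.jnu.encoding is a JDK-internal property that Sun-derived JVMs happen to set; it is not part of the documented API, so treating it as meaningful is an assumption.

```java
import java.nio.charset.Charset;

public class ShowEncodingSettings {
    // Looks up a system property, returning "(not set)" when it is absent.
    static String prop(String key) {
        return System.getProperty(key, "(not set)");
    }

    public static void main(String[] args) {
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("file.encoding    = " + prop("file.encoding"));
        // JDK-internal and undocumented; may be absent on other JVMs.
        System.out.println("sun.jnu.encoding = " + prop("sun.jnu.encoding"));
        System.out.println("user.language    = " + prop("user.language"));
        System.out.println("user.country     = " + prop("user.country"));

        // Environment variables whose name hints at locale or code page.
        // Matching on these substrings is only a heuristic.
        for (String key : System.getenv().keySet()) {
            String upper = key.toUpperCase();
            if (upper.contains("LANG") || upper.contains("LC_") || upper.contains("CODEPAGE")) {
                System.out.println("env " + key + " = " + System.getenv(key));
            }
        }
    }
}
```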
Keith Wong
Greenhorn

Joined: Jun 03, 2008
Posts: 7
I forgot to mention: the computer has been set up for Chinese display by following these instructions:
http://blog.wensheng.com/2005/05/dos-chinese-under-window-xp.html

I mistyped: I meant UTF-8 and UTF-16, not UT-8 and UTC-16. I had tried both of those, and Big5 as well.

Thanks,
Keith
 