file name encoding problem

 
Kevin Ton
Greenhorn
Posts: 1
What encoding is used for the file name when you create a file like this:
-----------------------------------------
String filename = "name including Chinese characters";
File file = new File(filename);
file.createNewFile();
-----------------------------------------

In the Java class, the filename string contains Chinese characters.
I thought the filename encoding might be affected by the OS charset. But when file.encoding is CP-1252 and I run the class, the file is created and its name is displayed correctly, without garbled characters.

So my question is: which factors affect the filename encoding?

Thanks,
Kevin
 
Greg Charles
Sheriff
Posts: 2985
Hi Kevin,

Welcome to Java Ranch!

I'm confused by a couple of points in your question. First, you say the file encoding is CP-1252. Which file is encoded that way? If it's the Java source file, then I don't think you could even save code that contains Chinese characters, but I could be wrong. If it's the encoding of the file you are creating, that would affect the contents of the file, not its name. Are you ever seeing garbled characters? If so, where? In a command line directory listing? In a graphical file explorer? In an IDE?

For what it's worth, I tried to put 恭贺新禧 into my Java source file inside the Eclipse IDE. In order to save the file, I had to change the file properties to set the encoding to UTF-16 or UTF-8 instead of Cp1252. In order to see the characters display in the Eclipse window, I had to change the font for the Java editor to Arial Unicode MS (I'm on Windows at the moment) and the Script to Chinese-GB2312.

I've never really understood that Script setting and how it relates to Unicode. It seems to me if I have a character code for 恭 (606D), and the font has a character matching that code, it should display it. Why do I have to tell it what script to use? Maybe someone can answer that for both of us!
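
As a small illustration of the "character code" part of this, here is a minimal sketch that prints the Unicode code points held in a Java String. The class and method names are made up for the example. The string is written with Unicode escapes for 恭贺新禧 so that the source file stays pure ASCII and does not depend on the editor's encoding. Whether a font can actually draw those code points is a separate matter that this code does not touch.

```java
// Sketch only: class and method names are hypothetical.
class CodePointDump {
    // Formats each code point of s as U+XXXX, separated by single spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        System.out.println(describe("\u606d\u8d3a\u65b0\u79a7"));
        // prints "U+606D U+8D3A U+65B0 U+79A7"
    }
}
```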
 
Stephan van Hulst
Bartender
Posts: 5553
Greg, I think Kevin was wondering what character set the OS uses to store file names in the file system tables, and if it can be influenced in any way by the user.

I'm afraid I don't have an answer, though.
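
One way to start narrowing this down is to print the encoding-related settings the JVM is running with. The sketch below is only a diagnostic, not an answer. It assumes an OpenJDK/HotSpot style JVM: file.encoding sets the default charset for file contents, while sun.jnu.encoding is an internal, unsupported property that, as far as I know, is what the JDK uses on Unix-like systems when converting path strings to bytes for the OS. On Windows the JDK is generally understood to call the wide-character file APIs, which would explain a file name coming out fine even with file.encoding set to Cp1252. The class and method names are made up.

```java
import java.nio.charset.Charset;

// Sketch only: class and method names are hypothetical.
class EncodingReport {
    // Returns the system property value, or "(not set)" if the JVM does not define it.
    static String prop(String key) {
        return System.getProperty(key, "(not set)");
    }

    public static void main(String[] args) {
        System.out.println("os.name                = " + prop("os.name"));
        // Default charset for file contents (readers, writers, String.getBytes()).
        System.out.println("file.encoding          = " + prop("file.encoding"));
        // Internal property; assumed here to govern path name conversion on Unix-like JVMs.
        System.out.println("sun.jnu.encoding       = " + prop("sun.jnu.encoding"));
        System.out.println("Charset.defaultCharset = " + Charset.defaultCharset());
    }
}
```

The values differ by OS, locale, and JDK version, so the output is only meaningful on the machine where the file is being created.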
 
Greg Charles
Sheriff
Posts: 2985
I believe all modern operating systems can store non-ASCII characters in file names. They might not be able to display them in every case, though. For example, in Windows you might have to enable Asian Language Support before the system fonts used in the file explorer or the command window are updated to show Chinese characters in file names. That's why I asked Kevin where he's seeing garbled characters. Just to be sure though, Kevin, what OS are you using?
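
To separate "stored wrongly" from "displayed wrongly", one option is to skip the display step entirely: create the file, read the directory listing back in Java, and compare the strings. The following is a rough sketch under that assumption. The class and method names are made up, and the result depends on the OS, file system, and locale, so no particular output is promised for the Chinese name.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Sketch only: class and method names are hypothetical.
class UnicodeFileNameCheck {
    // Creates an empty file called name in a fresh temp directory and reports
    // whether the directory listing returns exactly the same name.
    static boolean roundTrips(String name) {
        try {
            File dir = Files.createTempDirectory("fname-test").toFile();
            File file = new File(dir, name);
            try {
                file.createNewFile();
                String[] listed = dir.list();
                return listed != null && listed.length == 1 && listed[0].equals(name);
            } finally {
                // Clean up the temp file and directory.
                file.delete();
                dir.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        String name = "\u606d\u8d3a\u65b0\u79a7.txt";
        System.out.println("round trip ok: " + roundTrips(name));
    }
}
```

If this reports true but a directory listing in the console or file explorer still looks garbled, the problem is more likely in the font or console code page than in how the name was stored.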
 
Paul Clapham
Sheriff
Posts: 20953
It's also possible that Kevin is asking about what happens when you have Java source code containing non-ASCII characters. Presumably that source code, which is a text file, should be interpreted by the compiler as a text file in a certain encoding. If that encoding doesn't match the encoding which the editor was using when it created the file, then yes, problems are going to arise.
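
A minimal sketch of that mismatch, without involving the compiler: encode the text as UTF-8 (standing in for what the editor saved) and decode the same bytes as windows-1252 (standing in for what the reader assumed). The class and method names are made up, and windows-1252 is assumed to be available in the running JDK, which is normally the case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch only: class and method names are hypothetical.
class SourceEncodingMismatch {
    // Encodes text with one charset and decodes the resulting bytes with another,
    // which is what happens when the editor and the compiler disagree.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        String original = "\u606d\u8d3a\u65b0\u79a7";
        Charset cp1252 = Charset.forName("windows-1252");

        String wrong = reinterpret(original, StandardCharsets.UTF_8, cp1252);
        String right = reinterpret(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println("decoded as windows-1252 matches: " + wrong.equals(original)); // false
        System.out.println("decoded as UTF-8 matches:        " + right.equals(original)); // true
    }
}
```

For real source files, javac accepts an -encoding option (for example, javac -encoding UTF-8) to state which encoding the source was saved in, so the compiler does not have to fall back on the platform default.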
 