What encoding is used for the filename when you create a file like this:
String filename = "name including Chinese characters";
File file = new File(filename);
In the Java class, the filename string contains Chinese characters.
I think the filename encoding may be affected by the OS charset. But when file.encoding is CP-1252 and I run the class, the file is created and its name is correct, without garbled characters.
So my question is: which factor affects the filename encoding?
I'm confused by a couple of points in your question. First, the file encoding being CP-1252: which file is encoded that way? If it's the Java source file, then I don't think you could even save code that contains Chinese characters, but I could be wrong. If it's the encoding of the file you are creating, that would affect the contents of the file, not its name. Are you ever seeing garbled characters? If so, where? In a command line directory listing? In a graphical file explorer? In an IDE?
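To make the name-versus-contents distinction concrete, here is a minimal sketch rather than anything from the original post. The class and method names are hypothetical, and it assumes a writable temp directory. The charset only controls the bytes written into the file; the name is handed to the file system as a Java String either way. The Chinese text is written as Unicode escapes so the source file itself stays pure ASCII.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical demo class; the names here are illustrative only.
final class NameVersusContent {

    // Creates fileName in a fresh temp directory and writes content
    // encoded with contentCharset. Returns the path of the new file.
    static Path write(String fileName, String content, Charset contentCharset) {
        try {
            Path dir = Files.createTempDirectory("encoding-demo");
            Path file = dir.resolve(fileName);
            Files.write(file, content.getBytes(contentCharset));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Four Chinese characters as escapes, so this source needs no special encoding.
        String chinese = "\u606D\u8D3A\u65B0\u79A7";
        try {
            // Same name both times; only the content charset differs.
            Path utf8 = write(chinese + ".txt", chinese, StandardCharsets.UTF_8);
            Path utf16 = write(chinese + ".txt", chinese, StandardCharsets.UTF_16BE);
            System.out.println(utf8.getFileName() + " -> " + utf8.toFile().length() + " bytes");
            System.out.println(utf16.getFileName() + " -> " + utf16.toFile().length() + " bytes");
        } catch (InvalidPathException | UncheckedIOException e) {
            // Some platform or locale setups cannot represent this name at all.
            System.out.println("Could not create the file: " + e.getMessage());
        }
    }
}
```

If the two files end up with the same name but different sizes, that shows the content charset had no say in how the name was stored.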
For what it's worth, I tried to put 恭贺新禧 into my Java source file inside the Eclipse IDE. In order to save the file, I had to change the file properties to set the encoding to UTF-16 or UTF-8 instead of Cp1252. In order to see the characters display in the Eclipse window, I had to change the font for the Java editor to Arial Unicode MS (I'm on Windows at the moment) and the Script to Chinese-GB2312.
I've never really understood that Script setting and how it relates to Unicode. It seems to me if I have a character code for 恭 (606D), and the font has a character matching that code, it should display it. Why do I have to tell it what script to use? Maybe someone can answer that for both of us!
I believe all modern operating systems can store non-ASCII characters in file names. They might not be able to display them in all cases, though. For example, in Windows you might have to enable Asian Language Support before the system fonts used in the file explorer or the command window are updated to show Chinese file names. That's why I asked Kevin where he's seeing garbled characters. Just to be sure though, Kevin: what OS are you using?
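When answering the OS question, it may also help to post what the JVM itself reports. The sketch below is only a suggestion: the class name is made up, and sun.jnu.encoding is an undocumented, implementation-specific property that, as far as I know, some JVMs use when converting file names to the platform's form. It may be missing on other JVMs, so treat it as an assumption rather than a documented API.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that collects encoding-related settings for a bug report.
final class EncodingSettings {

    static Map<String, String> snapshot() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("os.name", System.getProperty("os.name"));
        // Default charset for file *contents* when no charset is given explicitly.
        settings.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific; may be null on some JVMs (assumption, see above).
        settings.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        settings.put("defaultCharset", Charset.defaultCharset().name());
        return settings;
    }

    public static void main(String[] args) {
        snapshot().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```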
It's also possible that Kevin is asking about what happens when you have Java source code containing non-ASCII characters. Presumably that source code, which is a text file, should be interpreted by the compiler as a text file in a certain encoding. If that encoding doesn't match the encoding which the editor was using when it created the file, then yes, problems are going to arise.
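That mismatch can be simulated without involving the compiler at all. The following is a rough sketch with hypothetical names: it encodes a string the way an editor would when saving, then decodes the same bytes the way a reader with a different setting would. With javac, the reader side is controlled by the -encoding option (for example javac -encoding UTF-8), which defaults to the platform encoding on older JDKs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it simulates a save/read encoding mismatch.
final class SourceEncodingMismatch {

    // Encodes text as the "editor" would (savedAs), then decodes the
    // resulting bytes as the "reader" would (readAs).
    static String reinterpret(String text, Charset savedAs, Charset readAs) {
        byte[] bytes = text.getBytes(savedAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // Four Chinese characters as escapes, so this file compiles under any source encoding.
        String original = "\u606D\u8D3A\u65B0\u79A7";
        Charset cp1252 = Charset.forName("windows-1252");

        // Saved as UTF-8 but read as Cp1252: the text comes back garbled.
        String garbled = reinterpret(original, StandardCharsets.UTF_8, cp1252);
        System.out.println("matches original: " + garbled.equals(original));

        // Saved as Cp1252: the characters cannot be represented and are replaced on encoding.
        String lossy = reinterpret(original, cp1252, cp1252);
        System.out.println("saved as Cp1252: " + lossy);

        // Same charset on both sides: the text survives the round trip.
        String intact = reinterpret(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + intact.equals(original));
    }
}
```

Writing non-ASCII text as \uXXXX escapes, as done here, is one way to sidestep the source-encoding question entirely, at the cost of readability.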