convert UTF8 encoded file to Unicode

 
Doug Cyporyn
Greenhorn
Posts: 2
I need to read in a record and replace a character in the record with another character.
The file I am reading is encoded in UTF8 format. In Java, I can read the file and specify the encoding that is used.

//specify file and create input stream with proper encoding
File f = new File("c:\\gme_test.txt");
FileInputStream file = new FileInputStream(f);
InputStreamReader inReader = new InputStreamReader(file, "UTF8");
It is my understanding that when Java reads the file it will convert it to Unicode. So, when I do this:
//read a record from the file into a String object
BufferedReader inBuffer = new BufferedReader(inReader);
String aRecord = inBuffer.readLine();
The result should be a Unicode String. Now I should be able to convert a character in the string. I am using the Unicode literal for the character as follows:

//replace ü with u
aRecord.replace('\u00FC', 'u');
The problem is that the character I am trying to replace is not found in the String. The string looks more like this:
RÃ¼sselsheim
Reading about UTF8, I have learned that some characters can only be represented by two bytes. Others, while encoded by two bytes, can easily convert without losing anything in translation. It seems to me that the second and third characters in the string above make up the UTF8 character I wish to convert.
Using UltraEdit-32 I was able to open my file and convert the encoding from UTF8 to Unicode. When I did this, "RÃ¼sselsheim" became "Rüsselsheim". Then, converting from Unicode to ASCII, "Rüsselsheim" remained "Rüsselsheim".
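For illustration, here is a minimal sketch of why one character can show up as two. It assumes the string is being displayed by something that treats each UTF8 byte as a separate ISO-8859-1 character; that is only a guess about the viewer, and the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Hypothetical helper: encode the text as UTF-8, then decode those bytes
    // as ISO-8859-1, so every byte turns into one character.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "R\u00FCsselsheim"; // U+00FC is u with umlaut
        String garbled = misdecode(original);

        // U+00FC is two bytes in UTF-8 (0xC3 0xBC), so the garbled
        // string is one character longer than the original.
        System.out.println(original.length()); // prints 11
        System.out.println(garbled.length());  // prints 12
    }
}
```

If the file really is decoded as UTF8 by the InputStreamReader, the String in memory should already hold the single character U+00FC, and the two-character form would only be a display issue.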
Any ideas?
 
Jose Botella
Ranch Hand
Posts: 2120
Welcome to the Ranch, Doug.
For me this code worked OK.

Though the first line was printed in NetBeans, not in a DOS window.
[ May 01, 2004: Message edited by: Jose Botella ]
 
Doug Cyporyn
Greenhorn
Posts: 2
Thanks for taking the time to reply. I came back to my code a while later and realized it was working. I needed to provide a String to accept the return value from the replace method.
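A minimal sketch of that fix, for anyone who hits the same thing. Java Strings are immutable, so replace returns a new String and leaves the original untouched. The class name, the stripUmlaut helper, and the sample record are made up for the example.

```java
public class ReplaceDemo {
    // Hypothetical helper: returns a copy of the record with U+00FC replaced by 'u'.
    static String stripUmlaut(String record) {
        return record.replace('\u00FC', 'u');
    }

    public static void main(String[] args) {
        String aRecord = "R\u00FCsselsheim";

        // Result discarded: aRecord still contains U+00FC afterwards.
        aRecord.replace('\u00FC', 'u');

        // Result kept: the returned String holds the replacement.
        String fixed = stripUmlaut(aRecord);
        System.out.println(fixed); // prints Russelsheim
    }
}
```

Assigning back to the same variable (aRecord = aRecord.replace(...)) would work just as well.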
Sorry for the hassle.
Doug
 