I need to read in a record and replace a character in the record with another character. The file I am reading is encoded in UTF8 format. In Java, I can read the file and specify the encoding that is used.
//specify file and create input stream with proper encoding
File f = new File("c:\\gme_test.txt");
FileInputStream file = new FileInputStream(f);
InputStreamReader inReader = new InputStreamReader(file, "UTF8");

It is my understanding that when Java reads the file it will convert it to Unicode. So, when I do this:

//read a record from the file into a String object
BufferedReader inBuffer = new BufferedReader(inReader);
String aRecord = inBuffer.readLine();

The result should be a Unicode String. Now I should be able to convert a character in the string. I am using the Unicode literal for the character as follows:
//replace ü with u
aRecord.replace('\u00FC', 'u');

The problem is that the character I am trying to replace is not found in the String. The string looks more like this: RÃ¼sselsheim

Reading about UTF8, I have learned that some characters can only be represented by two bytes. Others, while encoded by two bytes, can easily convert without losing anything in translation. It seems to me that the second and third characters in the string above make up the UTF8 character I wish to convert.

Using UltraEdit-32 I was able to open my file and convert the encoding from UTF8 to Unicode. When I did this "RÃ¼sselsheim" became "Rüsselsheim". Then converting from Unicode to ASCII, "Rüsselsheim" remained "Rüsselsheim". Any ideas?
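To illustrate the two-byte point in the question, here is a minimal sketch, not taken from the original post. It assumes a Java 7+ runtime (for `StandardCharsets`), and the class and method names are made up for the example. It shows that ü (U+00FC) is encoded as two bytes in UTF-8, and that those two bytes show up as two separate characters when a viewer decodes them as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
public class Utf8BytesDemo {

    // Encode the text as UTF-8, then decode the same bytes with another charset.
    static String decodeUtf8BytesAs(String text, Charset charset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, charset);
    }

    public static void main(String[] args) {
        String town = "R\u00FCsselsheim"; // 11 characters, one of them u-umlaut

        byte[] utf8 = town.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 12: the umlaut takes two bytes

        // The two bytes that encode the umlaut.
        System.out.println(String.format("%02X %02X", utf8[1] & 0xFF, utf8[2] & 0xFF)); // prints C3 BC

        // Decoding with the right charset restores the original 11 characters.
        System.out.println(decodeUtf8BytesAs(town, StandardCharsets.UTF_8).equals(town)); // prints true

        // Decoding with the wrong charset yields 12 characters instead.
        System.out.println(decodeUtf8BytesAs(town, StandardCharsets.ISO_8859_1).length()); // prints 12
    }
}
```

If the String really was decoded as UTF-8, it holds a single U+00FC character even when a console or editor that does not handle that character displays something else.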
Welcome to the Ranch, Doug. For me this code worked OK.
Though the first line was printed in NetBeans, not in a DOS window. [ May 01, 2004: Message edited by: Jose Botella ]
Thanks for taking the time to reply. I came back to my code a while later and realized it was working. I needed to assign the return value of the replace method to a String. Sorry for the hassle.

Doug
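For anyone who finds this thread later, the resolution can be sketched roughly as follows. This is a minimal illustration rather than the original program: the class and method names are invented, and the record is a hard-coded string instead of a line read from the file. Java Strings are immutable, so `replace` returns a new String and leaves the original untouched.

```java
// Hypothetical demo class; names are illustrative only.
public class ReplaceDemo {

    // Returns a copy of the record with every u-umlaut (U+00FC) replaced by a plain 'u'.
    static String replaceUmlautU(String record) {
        return record.replace('\u00FC', 'u');
    }

    public static void main(String[] args) {
        String aRecord = "R\u00FCsselsheim";

        // Result discarded: aRecord still contains the umlaut afterwards.
        aRecord.replace('\u00FC', 'u');
        System.out.println(aRecord.indexOf('\u00FC')); // prints 1

        // Result kept: assign the returned String back to the variable.
        aRecord = replaceUmlautU(aRecord);
        System.out.println(aRecord); // prints Russelsheim
    }
}
```

The same applies to the other String methods that appear to modify text, such as `trim` and `toUpperCase`: each returns a new String that has to be stored somewhere.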