Reading Tabs

Anthony Smith
Ranch Hand

Joined: Sep 10, 2001
Posts: 285
I have a text file with the following contents (<TAB> is an actual TAB keystroke):
US<TAB> United States USA
CA<TAB> Canada CAN

I just want to be able to access the three fields on each line, so I did the following:
import java.io.*;

public class file
{
    public static void main(String[] args)
    {
        File csv = new File("wl.txt");
        try {
            DataInputStream in = new DataInputStream(
                new FileInputStream("wl.txt"));

            DataOutputStream out = new DataOutputStream(
                new FileOutputStream("w2.txt"));
            char chr;

            while (true) {

                StringBuffer country_code = new StringBuffer(2);
                while ((chr = in.readChar()) != '\t') {
                    country_code.append(chr);
                    System.out.println(chr);
                }
                System.out.println("CC: " + country_code);

                StringBuffer country_name = new StringBuffer(20);
                while ((chr = in.readChar()) != '\t') {
                    country_code.append(chr);
                }
                System.out.println("CN: " + country_name);

                StringBuffer district = new StringBuffer(20);
                char lineSep = System.getProperty("line.separator").charAt(0);

                while ((chr = in.readChar()) != lineSep) {
                    district.append(chr);
                }
                System.out.println("D: " + district);
            }
        }
        catch (EOFException e) {
            System.out.println(e);
        }
        // System.
        catch (Exception e) {
            System.out.println(e);
        }
    }
}
*************
When I look at the output of the following line, all I see is '?':
System.out.println(chr);
What am I doing wrong?
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
The readChar() method of DataInputStream reads exactly two bytes and treats them as a single Unicode character (a 16-bit value, high byte first). The problem is, most text files aren't stored as two-byte Unicode - they're usually in your system's default encoding. On Windows in the Americas and Western Europe this is usually Cp1252, which is Microsoft's version of the Latin-1 encoding (an extension of ASCII). It's a one-byte encoding - which means that the DataInputStream is grabbing two characters in Cp1252 and reinterpreting them as one Unicode char, which results in gibberish. Instead of DataInputStream, try a FileReader wrapped in a BufferedReader.
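To make the two-byte effect concrete, here is a small sketch. The class and method names (ReadCharDemo, firstChar) are made up for illustration, and the byte array stands in for the start of the text file:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ReadCharDemo {
    // Hypothetical helper: returns whatever readChar() makes of the first two bytes.
    static char firstChar(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readChar(); // consumes two bytes, high byte first
        }
    }

    public static void main(String[] args) throws IOException {
        // The one-byte-per-character text "US" followed by a TAB.
        byte[] data = {'U', 'S', '\t'};
        char c = firstChar(data);
        // 'U' is 0x55 and 'S' is 0x53, so the two bytes fuse into one char, 0x5553.
        System.out.println(Integer.toHexString(c)); // prints 5553
    }
}
```

The fused value is nothing like 'U' or 'S', and a console that cannot display it will typically show '?' instead. It also means the loop never sees a lone '\t', because the TAB byte gets paired with its neighbour.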

The FileReader takes care of translating the system default encoding into characters, and the BufferedReader takes care of reading one line at a time. What you do with each line you've read is up to you...
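One way this could look is sketched below. It is only a sketch: the class and method names (ReadTabs, parseLine, readRecords) are invented for illustration, and it assumes the three fields on each line are separated by TAB characters and that wl.txt is in the platform default encoding.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

class ReadTabs {
    // Splits one line on TABs and trims stray spaces around each field.
    static String[] parseLine(String line) {
        String[] fields = line.split("\t", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    // Reads every non-blank line from the source and splits it into fields.
    static List<String[]> readRecords(Reader source) throws IOException {
        List<String[]> records = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) { // null signals end of file
            if (line.trim().isEmpty()) {
                continue;
            }
            records.add(parseLine(line));
        }
        return records;
    }

    public static void main(String[] args) {
        // FileReader decodes bytes using the default encoding (an assumption here).
        try (Reader file = new FileReader("wl.txt")) {
            for (String[] f : readRecords(file)) {
                if (f.length < 3) {
                    System.out.println("Skipping line with fewer than 3 fields");
                    continue;
                }
                System.out.println("CC: " + f[0]);
                System.out.println("CN: " + f[1]);
                System.out.println("D: " + f[2]);
            }
        } catch (IOException e) {
            System.out.println(e);
        }
    }
}
```

readLine() returns null at the end of the file, so there is no need to catch EOFException or to look for the line separator by hand. If the file is not in the default encoding, an InputStreamReader around a FileInputStream with an explicit charset name can replace the FileReader.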


"I'm not back." - Bill Harding, Twister
 
 