Stream Tokenizer - Special case

 
Ranch Hand
Posts: 31
Hi,
In my application we are using StreamTokenizer to read a file, via the constructor StreamTokenizer(Reader r).
The file is in comma-separated format. Our requirement is to split the data into tokens, treating the comma (,) as the whitespace character. We call resetSyntax() to make all characters ordinary, then mark the characters we need as word characters, and call nextToken() in a loop to extract the data.
The problem arises when the data in the file looks like this:
40,,44,56,,,78,,,,
Note that there is nothing between the commas.
In this case nextToken() just skips the empty data cells, but we need to identify these empty or null cells. Please help with any suggestions. I am not sure StreamTokenizer is going to help.
The code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class MyParser {
    private StreamTokenizer parser;

    public MyParser() {
        super();
    }

    public MyParser(Reader in) {
        super();
        // Wrap the reader in a BufferedReader unless it already is one.
        if (!(in instanceof BufferedReader)) {
            BufferedReader br = new BufferedReader(in);
            parser = new StreamTokenizer(br);
        } else {
            parser = new StreamTokenizer(in);
        }
        init();
    }

    // Returns the cells of the next row.
    public String[] getRow() throws IOException {
        List<String> results = new ArrayList<String>();
        // Step past the end-of-line token left over from the previous row.
        if ((parser.ttype == 0) || (parser.ttype == StreamTokenizer.TT_EOL)) {
            parser.nextToken();
        }
        while ((parser.ttype != StreamTokenizer.TT_EOL) && (parser.ttype != StreamTokenizer.TT_EOF)) {
            // Empty cells never show up here: consecutive commas are
            // skipped together as one run of whitespace.
            if (parser.sval != null) {
                results.add(parser.sval);
            }
            parser.nextToken();
        }
        return results.toArray(new String[0]);
    }

    public boolean hasMoreRows() {
        return parser.ttype != StreamTokenizer.TT_EOF;
    }

    private void init() {
        parser.resetSyntax();               // every character becomes ordinary
        parser.whitespaceChars(',', ',');   // comma separates tokens
        parser.quoteChar('"');
        parser.wordChars(' ', ' ');
        parser.wordChars(32, 43);           // everything below the comma (44)
        parser.wordChars(45, 255);          // everything above the comma
        parser.eolIsSignificant(true);
    }

    public static void main(String[] args) {
        try {
            // Placeholder host; the original post had "ftp//myserver".
            URL url = new URL("ftp://myserver");
            url.openConnection();
            InputStream ftp = url.openStream();
            InputStreamReader ftpr = new InputStreamReader(ftp);
            MyParser my = new MyParser(ftpr);

            do {
                String[] st = my.getRow();
                System.out.println("START NEW ROW" + " " + st.length);
                for (int i = 0; i < st.length; i++) {
                    System.out.println(st[i] + " " + st[i].length());
                }
            } while (my.hasMoreRows());
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
}
 
Sheriff
Posts: 7023
What about something simple...
Before parsing the text with StreamTokenizer, go through it and add a space (or anything you'd like) between two consecutive commas.
 
Srihari Injeti
Ranch Hand
Posts: 31
Hi Dirk,
How do I go about doing this? Can you give me an example? How can I go through the data, and where can I store the output after processing it? I cannot store it on disk and read it back in to parse it a second time.
I appreciate your help
Thanks
Sri

Originally posted by Dirk Schreckmann:
What about something simple...
Before parsing the text with StreamTokenizer, go through it and add a space (or anything you'd like) between two consecutive commas.

 
Srihari Injeti
Ranch Hand
Posts: 31
Does anyone know of any open source code for CSV parsers?
Your help is very much appreciated.
Thanks
Sri
 
Greenhorn
Posts: 2
Hi,
I gave up using StreamTokenizer as it's so quirky.
Couldn't you use StringTokenizer instead? Then you can write your own methods to handle the specific data coming through.
Hope this helps.
Terence
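
For anyone trying the StringTokenizer route: by default it also skips empty cells, but the three-argument constructor can hand back the delimiters as tokens, which makes the gaps visible. The following is only a sketch under that assumption; the class and method names (DelimAwareSplitter, splitRow) are made up for illustration, and it does not handle quoted fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimAwareSplitter {
    // Splits one line on commas and keeps empty cells as "".
    static List<String> splitRow(String line) {
        List<String> cells = new ArrayList<>();
        // returnDelims = true: each comma comes back as its own token.
        StringTokenizer st = new StringTokenizer(line, ",", true);
        boolean lastWasDelim = true; // start of line behaves like "just after a comma"
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (tok.equals(",")) {
                // Two delimiters in a row (or a leading comma) mean an empty cell.
                if (lastWasDelim) {
                    cells.add("");
                }
                lastWasDelim = true;
            } else {
                cells.add(tok);
                lastWasDelim = false;
            }
        }
        // A trailing comma (or an empty line) leaves one last empty cell.
        if (lastWasDelim) {
            cells.add("");
        }
        return cells;
    }

    public static void main(String[] args) {
        List<String> row = splitRow("40,,44,56,,,78,,,,");
        System.out.println(row.size()); // prints 11
        System.out.println(row);
    }
}
```

Reading the file line by line with a BufferedReader and passing each line to a method like this keeps everything in memory, so nothing has to be written to disk.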
 
Dirk Schreckmann
Sheriff
Posts: 7023
You could use a StringBuffer to hold your text data while you are manipulating it. Here's a simple for loop that would do the insertion of a space character between any two consecutive commas:
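
The loop itself is missing from this copy of the thread, so what follows is a reconstruction of the idea rather than the original code. It assumes the text is already in a StringBuffer; the names CommaPadder, padEmptyCells and tokenize are hypothetical. The padded text is fed to StreamTokenizer through a StringReader, so nothing needs to be stored on disk.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommaPadder {
    // Inserts one space between every pair of adjacent commas, in place.
    // Note: a comma at the very start or end of the text is not padded.
    static StringBuffer padEmptyCells(StringBuffer text) {
        for (int i = 0; i < text.length() - 1; i++) {
            if (text.charAt(i) == ',' && text.charAt(i + 1) == ',') {
                text.insert(i + 1, ' ');
            }
        }
        return text;
    }

    // Tokenizes padded text with the same syntax table as the question's init(),
    // minus quote and end-of-line handling; padded cells are trimmed back to "".
    static List<String> tokenize(String padded) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(padded));
        st.resetSyntax();
        st.whitespaceChars(',', ',');
        st.wordChars(32, 43);
        st.wordChars(45, 255);
        List<String> cells = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.sval != null) {
                    cells.add(st.sval.trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cells;
    }

    public static void main(String[] args) {
        StringBuffer text = new StringBuffer("40,,44,56,,,78");
        System.out.println(padEmptyCells(text)); // prints 40, ,44,56, , ,78
        System.out.println(tokenize(text.toString()).size()); // prints 7
    }
}
```

Because the space is a word character in that syntax table, each padded cell comes back as a token holding a single space, which is why the sketch trims it. Data that legitimately contains only spaces would be indistinguishable from an empty cell.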

Now, I have a philosophy paper due, will you write it for me?
 
Ranch Hand
Posts: 125
I wrote an open source comma separated value parser:
http://ostermiller.org/utils/CSVLexer.html
 