
Tokens

Kevin Luludis
Ranch Hand

Joined: Feb 13, 2001
Posts: 30
What would be the best way to read in any text-based file, and then pass through the file as an array, splitting the text up into tokens?

I want to be able to take each token and then check exactly what kind of token it is. Is it an Id, expression, term, factor, etc.?
Thanks a lot
Stephanie Grasson
Ranch Hand

Joined: Jun 14, 2000
Posts: 347
Kevin,
Could you use something like this:

Stephanie
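
The code from this reply did not survive in the thread. As a rough sketch only (not Stephanie's original), something along these lines would fit the follow-up discussion: the text goes into a String s and is split with the StringTokenizer(String) constructor mentioned later in the thread. The class name StringTokens and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class name; a sketch, not the code originally posted.
class StringTokens {

    // Splits s on the default delimiters (space, tab, newline,
    // carriage return, form feed) and returns the tokens in order.
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample text is an assumption, modeled on the file shown later in the thread.
        String s = "Apples := Pears + 4371 - Pears DIV Apples";
        for (String token : tokenize(s)) {
            System.out.println(token);
        }
    }
}
```

Each token comes back as a plain String, so a later step can inspect it to decide whether it is an identifier, a number, or an operator.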
Kevin Luludis
Ranch Hand

Joined: Feb 13, 2001
Posts: 30
WOW! Stephanie, that was great! Thank you very much. I just wanted to tell you that before I went to lunch. I have some more questions on this matter that maybe you could help me with. Thanks again!
Kevin Luludis
Ranch Hand

Joined: Feb 13, 2001
Posts: 30
I see how this works when the text I want tokenized is declared as the String s. But how would I do this using file I/O?
Stephanie Grasson
Ranch Hand

Joined: Jun 14, 2000
Posts: 347
Kevin,
You could try:

This assumes that you have a file in the current directory called fruit.txt that looks like this:

Apples := Bannana + 4371 - Pears DIV Apples
Bananna := Pears + 1734 - Apples DIV Bananna
Pears := Apples + 3471 - Bananna DIV Pears

Hope this helps.
Stephanie
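
The code from this reply is also missing from the thread. A minimal sketch of the same idea, assuming the file is read line by line with a BufferedReader and each line is handed to StringTokenizer, might look like the following. The class name FileTokens is hypothetical, and fruit.txt is only the default because it is the file name used in the post.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class name; a sketch, not the code originally posted.
class FileTokens {

    // Reads every line from the source and splits each one on the
    // default StringTokenizer delimiters. The reader is closed afterwards.
    static List<String> tokenize(Reader source) throws IOException {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                StringTokenizer st = new StringTokenizer(line);
                while (st.hasMoreTokens()) {
                    tokens.add(st.nextToken());
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to fruit.txt in the current directory; pass another path as args[0].
        String path = args.length > 0 ? args[0] : "fruit.txt";
        for (String token : tokenize(new FileReader(path))) {
            System.out.println(token);
        }
    }
}
```

Taking a Reader rather than a file name keeps the method usable with a StringReader, which is handy for trying it out without a file on disk.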
Kevin Luludis
Ranch Hand

Joined: Feb 13, 2001
Posts: 30
Thanks again. You have been a tremendous help. Does StringTokenizer automatically ignore whitespace? It recognized EOL fine, but I had trouble with the whitespace. I'm just asking because I know there is StreamTokenizer, but it looks like StringTokenizer handles this much better!
Stephanie Grasson
Ranch Hand

Joined: Jun 14, 2000
Posts: 347
Kevin,
I believe how whitespace is handled depends on how you set up your StringTokenizer.
I used the constructor
StringTokenizer( String s );
which works as follows:

Constructs a string tokenizer for the specified string. The tokenizer uses the default delimiter set, which is " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character. Delimiter characters themselves will not be treated as tokens.

If this is not what you had in mind, you could check out one of the other StringTokenizer constructors.
Sorry, I've never used StreamTokenizer, so I don't know about it.
Stephanie
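
As an illustration of the other constructors mentioned above, StringTokenizer(String str, String delim, boolean returnDelims) lets you name your own delimiters and ask for them back as tokens. The sketch below assumes the operator characters : = + - should split tokens even when no space surrounds them; that delimiter set and the class name DelimTokens are illustrative choices, not something from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class name; the delimiter set below is an assumption.
class DelimTokens {

    // Default whitespace delimiters plus a few operator characters.
    private static final String DELIMS = " \t\n\r\f:=+-";

    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true: each delimiter comes back as a one-character token.
        StringTokenizer st = new StringTokenizer(s, DELIMS, true);
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            // Keep operator delimiters, drop the whitespace ones.
            if (!tok.trim().isEmpty()) {
                tokens.add(tok);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // No spaces around the operators, yet the tokens still separate.
        for (String token : tokenize("Apples:=Pears+4371-Pears DIV Apples")) {
            System.out.println(token);
        }
    }
}
```

One thing to watch: delimiters are returned one character at a time, so := arrives as : followed by =, and the caller has to rejoin them if it wants a single assignment token.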
Kevin Luludis
Ranch Hand

Joined: Feb 13, 2001
Posts: 30
This is exactly what I had in mind, but if I were to read in some data that was accidentally missing a space, the tokens would then be joined together. Not exactly what I wanted, but super close!! What would I have to change in order for it to not only handle tabs, new lines, and returns, but also spaces?
Angela Lamb
Ranch Hand

Joined: Feb 22, 2001
Posts: 156
The StreamTokenizer is very similar to the StringTokenizer, except that it takes either an InputStream or a Reader in the constructor instead of a String. It also has some very nice features for determining whether the token is a word, a number, or an end of line (so it will separate by both spaces and line breaks). It even tells you what the line number for each token is. Here's a link to the docs:

http://java.sun.com/j2se/1.3/docs/api/java/io/StreamTokenizer.html
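
A small sketch of what that looks like in practice, assuming a Reader as the source: after each call to nextToken(), the ttype field says what kind of token was read, sval or nval holds its value, and lineno() gives the current line. The class name StreamTokens and the output wording are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; a sketch using StreamTokenizer's default syntax.
class StreamTokens {

    // Returns one description per token: its line, its kind, and its value.
    static List<String> describe(Reader source) throws IOException {
        List<String> out = new ArrayList<>();
        StreamTokenizer st = new StreamTokenizer(source);
        st.eolIsSignificant(true); // report line ends as TT_EOL tokens
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
                case StreamTokenizer.TT_WORD:
                    out.add("line " + st.lineno() + " word " + st.sval);
                    break;
                case StreamTokenizer.TT_NUMBER:
                    // Numbers are parsed into a double held in nval.
                    out.add("line " + st.lineno() + " number " + st.nval);
                    break;
                case StreamTokenizer.TT_EOL:
                    out.add("end of line");
                    break;
                default:
                    // Any other character is returned as its own token.
                    out.add("line " + st.lineno() + " char " + (char) st.ttype);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String text = "Apples := Pears + 4371\nPears := Apples DIV 2";
        for (String d : describe(new StringReader(text))) {
            System.out.println(d);
        }
    }
}
```

With the default settings, / starts a comment and quote characters start string tokens, so input that uses those characters may need ordinaryChar(...) calls first.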
Kevin Luludis
Ranch Hand

Joined: Feb 13, 2001
Posts: 30
I had tried that, but it didn't give me very good results. Any suggestions for optimizing this?

So far all I got it to do was tell me whether the token was a word or a number, and then give me its corresponding value. It did ignore the whitespace, which was good, but I would love to get the results that the StringTokenizer gave above.
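
One possible way to get StringTokenizer-style results out of StreamTokenizer is to reset its syntax table and declare letters and digits as word characters, so numbers come back as text instead of doubles. This is only a sketch under that assumption; the class name WordLikeTokens and the choice of word characters are illustrative.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; the syntax setup below is one option among several.
class WordLikeTokens {

    static List<String> tokenize(Reader source) throws IOException {
        StreamTokenizer st = new StreamTokenizer(source);
        st.resetSyntax();           // every character becomes "ordinary", number parsing is off
        st.wordChars('a', 'z');
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');     // digits count as word characters, so 4371 stays text
        st.whitespaceChars(0, ' '); // control characters and space separate tokens

        List<String> tokens = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                tokens.add(st.sval);
            } else {
                // Ordinary characters such as : = + - are returned one at a time.
                tokens.add(String.valueOf((char) st.ttype));
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        String text = "Apples:=Pears+4371 - Pears DIV Apples";
        for (String token : tokenize(new StringReader(text))) {
            System.out.println(token);
        }
    }
}
```

Because operators are ordinary characters here, tokens separate correctly even when a space is missing, which covers the joined-token problem raised earlier in the thread.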
 