Using StringTokenizer?

Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
Hey guys, I have been working on a project to count the number of words in a given text file.

I am using the StringTokenizer with specified delimiters.

The problem I am running into is that my count is displaying zero. I am not sure where my error lies.

If someone could point out the error, I would greatly appreciate it. Thank you










Lincoln's Gettysburg Address, given November 19, 1863
on the battlefield near Gettysburg, Pennsylvania, USA


Four score and seven years ago, our fathers brought forth
upon this continent a new nation: conceived in liberty, and
dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war. . .testing whether
that nation, or any nation so conceived and so dedicated. . .
can long endure. We are met on a great battlefield of that war.

We have come to dedicate a portion of that field as a final resting place
for those who here gave their lives that this nation might live.
It is altogether fitting and proper that we should do this.

But, in a larger sense, we cannot dedicate. . .we cannot consecrate. . .
we cannot hallow this ground. The brave men, living and dead,
who struggled here have consecrated it, far above our poor power
to add or detract. The world will little note, nor long remember,
what we say here, but it can never forget what they did here.

It is for us the living, rather, to be dedicated here to the unfinished
work which they who fought here have thus far so nobly advanced.
It is rather for us to be here dedicated to the great task remaining
before us. . .that from these honored dead we take increased devotion
to that cause for which they gave the last full measure of devotion. . .
that we here highly resolve that these dead shall not have died in vain. . .
that this nation, under God, shall have a new birth of freedom. . .
and that government of the people. . .by the people. . .for the people. . .
shall not perish from this earth.
Darryl Burke
Bartender

Joined: May 03, 2008
Posts: 4571
    

Have you read the API for StringTokenizer?
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.


luck, db
There are no new questions, but there may be new answers.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39053
    
. . . and welcome to the Ranch
Edwin Torres
Ranch Hand

Joined: Mar 19, 2011
Posts: 55


It doesn't make sense to read in a new line here, when you haven't finished counting the words for the previous line. You should process one line (count the words), then move onto the next line.

Like Darryl pointed out, don't use StringTokenizer. Instead use regular expressions. For example:
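A minimal sketch of the regex approach, assuming a "word" is any run of non-whitespace characters. The pattern `\S+` and the class name are illustrative only, not the pattern from the original post:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordCount {
    // Assumed pattern: a word is one or more non-whitespace characters.
    private static final Pattern WORD = Pattern.compile("\\S+");

    static int countWords(String line) {
        Matcher matcher = WORD.matcher(line);
        int count = 0;
        // find() advances to the next match each time it returns true.
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Four score and seven years ago")); // prints 6
    }
}
```

Punctuation attached to a word ("ago,") counts as part of that word under this pattern, which is fine for counting but not for extracting clean words.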

You'll have to come up with the pattern for your app though.

Edit: fixed my regular expression. I think that should do it. Thanks Campbell for pointing that out.


Twitter: @realEdwinTorres
Blog: java Friendly
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39053
    
You would appear to be passing a regular expression (regex); I didn't know StringTokenizer even took a regex. String#split always takes a regex. You also seem to be reading a line only once.
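To illustrate the difference: StringTokenizer treats its second argument as a set of single delimiter characters, not as a regex. The sketch below assumes a delimiter string such as "\\s+" was passed (the original code is not shown, so that is only a guess at the situation); the class and method names are made up for the example:

```java
import java.util.StringTokenizer;

public class DelimiterDemo {
    static int tokens(String text, String delimiters) {
        return new StringTokenizer(text, delimiters).countTokens();
    }

    public static void main(String[] args) {
        String text = "one two three";
        // Each character is its own delimiter here: '\', 's' and '+'.
        // None of them occurs in the text, so the whole line is one token.
        System.out.println(tokens(text, "\\s+"));      // prints 1
        // Literal whitespace characters are what the tokenizer expects.
        System.out.println(tokens(text, " \t\n"));     // prints 3
        // String#split does interpret its argument as a regex.
        System.out.println(text.split("\\s+").length); // prints 3
    }
}
```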
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39053
    
I would have thought you can use the standard regex for whitespace. You may need to change \ to \\ or even \\\\.
Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
Ok, thank you for your responses.
I'm not sure I follow completely, but I will go re-read the API for StringTokenizer as well as change my code.
I will post back here when I have sorted out the answer.
Thank you again for the swift replies

EDIT: Sorry, I didn't see where you said not to use StringTokenizer. The problem is that I am required to use StringTokenizer for the project.
Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
Ok, I have revised my code, but now the problem is that it reads just a single line.
How can I make it iterate over every line and then display the total number of words and chars?

Ralph Cook
Ranch Hand

Joined: May 29, 2005
Posts: 479
I don't see a loop to read through lines, only one to go through tokens.

You need two loops, one inside the other:
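A minimal sketch of that loop-within-a-loop shape, using an in-memory array of lines instead of a file so that only the nesting is on show. The class name, method name and sample lines are illustrative assumptions:

```java
import java.util.StringTokenizer;

public class TwoLoops {
    static int countWords(String[] lines) {
        int words = 0;
        // Outer loop: runs once per line.
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            // Inner loop: runs once per token on the current line.
            while (tokenizer.hasMoreTokens()) {
                tokenizer.nextToken();
                words++;
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String[] lines = {"Four score and", "seven years ago"};
        System.out.println(countWords(lines)); // prints 6
    }
}
```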

As your code is now, you read a line, then you read a second line and hand it to the tokenizer. You then enter a loop to process tokens; after each of those tokens, you read another line which you do not process. If the second line had more tokens than you had lines in the file, the program would fail trying to read lines that it didn't have. After the token loop, you read another line.

While StringTokenizer is certainly old, I disagree with those who say you "should" use regular expressions. It seems to me a bit too simple a job to have to study regular expressions just to get it done. And, as you said, you're required to use this class.

Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
Thank you for your reply, I'm just unsure of how to implement the algorithm you provided within my program.
Ralph Cook
Ranch Hand

Joined: May 29, 2005
Posts: 479
BufferedReader.readLine returns a null if the end of the stream has been reached (http://download.oracle.com/javase/6/docs/api/). Therefore you can set up a loop to read through the lines in the file with:

This code will dump all the lines onto System.out, assuming you have opened the file correctly.

StringTokenizer has a method to tell you whether to continue (or start) your loop.
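The method in question is hasMoreTokens(). A minimal sketch of a token loop built on it, assuming the default whitespace delimiters; the class and method names are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenLoop {
    static List<String> tokensOf(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        // hasMoreTokens() is checked before every pass, including the first,
        // so an empty line simply skips the loop body.
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokensOf("shall not perish from this earth."));
    }
}
```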



So this could go in place of the "System.out.println" as the way you process the lines you read; this loop, in other words, goes within that loop.

I'm trying not to write the program for you -- is this enough for hints?

rc
Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
Ok following along right here



Throws my output into an infinite loop displaying the first line of my testfile.txt over and over again.
How am I reading the file wrong?

Thank you for your replies and for not just doing it for me; I appreciate you taking the time to actually teach me.
Ralph Cook
Ranch Hand

Joined: May 29, 2005
Posts: 479
Oh, well, that's a BUG.

After the println, do another line = myReader.readLine(), inside the loop's braces.
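A minimal sketch of the loop with that fix applied. The variable name myReader comes from the post; the class name, the method and the StringReader standing in for the file are assumptions made so the example runs anywhere:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadLoopFixed {
    static int printLines(BufferedReader myReader) {
        int printed = 0;
        try {
            String line = myReader.readLine();
            while (line != null) {
                System.out.println(line);
                printed++;
                // Advance to the next line; without this the condition
                // never changes and the first line prints forever.
                line = myReader.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return printed;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the text file.
        BufferedReader myReader =
                new BufferedReader(new StringReader("first line\nsecond line"));
        printLines(myReader);
    }
}
```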

rc
Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
OK, fixed that, and it now displays correctly.
Now for the second part, I am still confused.
Again, thank you very much.
Jack Sloan
Greenhorn

Joined: Mar 23, 2011
Posts: 11
Ok, I may have answered my own question by taking a different route.



I do it this way and I get the proper word count, it is just displayed for every line. How can I just get the final result?
Ralph Cook
Ranch Hand

Joined: May 29, 2005
Posts: 479
Well done -- you have your own loop-within-a-loop. The outer loop is defined by "while(inputFile.hasNextLine())" and its braces, and the inner loop by "while(tokenizer.hasMoreTokens())" and its braces.

Your printout statement "System.out.println(wordcount);" comes outside the inner loop, but within the outer loop. If you only want the total after the outer loop has run, then put that statement outside the outer loop braces. In other words, everything within those braces is going to be executed once per line; anything below those braces will only be executed after all lines are processed.
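A minimal sketch of that placement. The names inputFile, tokenizer and wordcount are taken from the statements quoted above; the delimiter set, the class name and the in-memory sample text are assumptions, since the original code is not shown:

```java
import java.util.Scanner;
import java.util.StringTokenizer;

public class WordCountTotal {
    // Assumed delimiter set; the project's actual delimiters are not shown.
    private static final String DELIMITERS = " .,:";

    static int countWords(Scanner inputFile) {
        int wordcount = 0;
        while (inputFile.hasNextLine()) {
            StringTokenizer tokenizer =
                    new StringTokenizer(inputFile.nextLine(), DELIMITERS);
            while (tokenizer.hasMoreTokens()) {
                tokenizer.nextToken();
                wordcount++;
            }
            // A println placed here would run once per line.
        }
        return wordcount;
    }

    public static void main(String[] args) {
        // A Scanner over a String stands in for the Scanner over the file.
        Scanner inputFile = new Scanner(
                "Four score and seven years ago, our fathers\nbrought forth. . .testing");
        int wordcount = countWords(inputFile);
        // Outside the outer loop's braces: printed once, after all lines.
        System.out.println(wordcount); // prints 11
    }
}
```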

(an only-vaguely-related tip: *I* find it helpful to line beginning and ending braces up; this is not universal, and in fact it is common for people to put the beginning brace at the end of its line and the ending brace indented to the same space as code within the braces. But I find it very helpful to line up the braces, and I would think it would be doubly helpful to someone just learning their use.)

rc
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39053
    
Ralph Cook wrote: . . .
This code will dump all the lines onto System.out, . . .
Surely that will read one line and print it repeatedly until the computer breaks down.

More likely
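A minimal sketch of the usual idiom, where the read and the null test share the loop condition so the line can never go stale. The class name, method and the StringReader source are illustrative assumptions:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadLoopIdiom {
    static int printLines(Reader source) {
        int printed = 0;
        // try-with-resources closes the reader when the block ends.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // Assign and test in one step: a fresh line is read on every pass.
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
                printed++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return printed;
    }

    public static void main(String[] args) {
        // For a real file, pass a java.io.FileReader instead of a StringReader.
        printLines(new StringReader("first line\nsecond line\nthird line"));
    }
}
```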
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39053
    
Jack Sloan wrote: . . . I am required to use the StringTokenizer for the project
It is worthwhile quietly, gently, and tactfully asking why you are required to use legacy code. And the bit about "discouraged in new code" is about halfway down the description of StringTokenizer, after the class name bit and before the constructor summary. Darryl Burke has already quoted it. You will also find a recommendation about what to use instead in the same place.
 