
Tokenize numbers only, skip words

Jerri Loh
Ranch Hand

Joined: Jul 06, 2010
Posts: 31
Hi there, does anybody know how I should tokenize only the numbers from a text file and skip the words?

I have two approaches:

The first is a switch on the token type,

something like:

// s is a java.io.StreamTokenizer; the TT_* constants belong to StreamTokenizer, not StringTokenizer
while (s.nextToken() != StreamTokenizer.TT_EOF) {
    switch (s.ttype) {
        case StreamTokenizer.TT_WORD:
            System.out.println("Header and Title ignored");
            // do not want it to be returned; not sure how to skip it here
            break;
        case StreamTokenizer.TT_NUMBER:
            return; // not sure about this one
    }
}

or the whitespaceChars method in the StreamTokenizer class, something like:

public void whitespaceChars(int low, int hi)

I am not sure how to go about it. Is anybody able to help me out here?
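
For readers following the first approach, here is a minimal sketch of how a StreamTokenizer loop could keep the numbers and drop the words. The class name NumberTokens, the method numbersOnly and the sample input are made up for illustration; they are not from the original post. It relies on StreamTokenizer's default settings, where numbers are parsed into nval as doubles, '.' and '-' count as part of a number, and '/' starts a comment, so real input may need extra configuration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class NumberTokens {
    // Hypothetical helper: returns every numeric token, ignoring word tokens.
    static List<Double> numbersOnly(Reader in) {
        StreamTokenizer st = new StreamTokenizer(in);
        List<Double> numbers = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    numbers.add(st.nval); // nval holds the parsed number as a double
                }
                // TT_WORD and ordinary characters are simply not collected
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Sample input is an assumption, not the poster's actual file
        System.out.println(numbersOnly(new StringReader("HEADER title 12 ATOM 3.5 end")));
    }
}
```

No explicit skip call is needed: a token that is not collected is discarded as soon as nextToken() is called again.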
Jim Size

Joined: Aug 10, 2010
Posts: 29
Hello there, I hope that I understand your problem: you need to extract only the numbers from a sentence by using the tokenizer.

So here is a piece of code that I believe is correct:

StringTokenizer st = new StringTokenizer("This is 12 an example"); // a random number added to the sentence

while (st.hasMoreTokens()) {
    String s = st.nextToken();
    try {
        int i = Integer.parseInt(s); // throws NumberFormatException for words
        System.out.println(i);
    } catch (NumberFormatException e) {
        // not an integer, so skip this token
    }
} // end while, end example code

// This is my first time replying to a problem and I don't know how to use the "code" thing, sorry guys.

So if s is an integer, i takes its value and then it gets printed out (you can do whatever you want with it); any other token is skipped.
I didn't compile it in Eclipse or BlueJ because I am pretty sure it works.
You have to import java.util.StringTokenizer again.
Sorry for my bad English.
Jerri Loh
Ranch Hand

Joined: Jul 06, 2010
Posts: 31
Hi. I actually solved it.

Jim Size

Joined: Aug 10, 2010
Posts: 29
Well done for solving your problem!
Maybe I didn't get it.
Your program tokenizes everything, but as far as I can see from the code you take the info at positions 7 and 8, I mean the 7th and 8th words in the text file,
and you add it to another document.
I thought that you needed only the numbers from a text, not specifically its 7th and 8th words.

Jerri Loh
Ranch Hand

Joined: Jul 06, 2010
Posts: 31
But it wasn't entirely me. Someone from Daniweb, Tong1, helped me with a fragment. Thank you J Sizeas. I really appreciate your help. See you around.
Satya Maheshwari
Ranch Hand

Joined: Jan 01, 2007
Posts: 368
You could also use pattern matching to find this pattern: \\s\\d\\d*\\s, i.e. a digit followed by any number of digits, with white space on either side. You can modify it a bit per your requirements.

Thanks and Regards
Rob Spoor

Joined: Oct 27, 2005
Posts: 19651

Of course \\d\\d* is equivalent to \\d+. Both mean 1 or more digits. And \\s may be too restrictive. How about dots, commas, other characters?
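
As a rough illustration of the \\d+ suggestion, the sketch below collects every run of digits with java.util.regex, without requiring white space around it. The class name DigitRuns, the method digitRuns and the sample sentence are assumptions made for this example. Note that a bare \\d+ also picks up digits glued to letters and would split a decimal such as 3.5 into two matches, so the pattern may need tightening for real data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DigitRuns {
    // \d+ matches one or more digits; no surrounding whitespace is required
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns each run of digits found in the text, in order.
    static List<String> digitRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = DIGITS.matcher(text);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // Sample sentence is an assumption; commas, brackets and dots do not block a match
        System.out.println(digitRuns("This is 12, an example (34) with x56."));
    }
}
```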

Jerri Loh
Ranch Hand

Joined: Jul 06, 2010
Posts: 31
I could not have used pattern matching because each line of the pdb text file was in a different format and contained unnecessary information as well.
Shanky Sohar
Ranch Hand

Joined: Mar 17, 2010
Posts: 1051

Using a regex or a Scanner may also be a solution when you want to do some tokenizing.
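
A Scanner-based version might look like the sketch below: it checks each whitespace-separated token with hasNextInt() and discards anything that is not an integer. The class name ScannerNumbers, the method intsOnly and the sample input are hypothetical, and the locale is fixed to Locale.US only so that number parsing does not depend on the machine's default locale.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

public class ScannerNumbers {
    // Hypothetical helper: keeps integer tokens and skips every other token.
    static List<Integer> intsOnly(String text) {
        List<Integer> numbers = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            sc.useLocale(Locale.US); // assumption: plain ASCII digits, US number format
            while (sc.hasNext()) {
                if (sc.hasNextInt()) {
                    numbers.add(sc.nextInt());
                } else {
                    sc.next(); // not an integer, consume and discard the token
                }
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Sample input is an assumption, not the poster's file
        System.out.println(intsOnly("This is 12 an example 7"));
    }
}
```

To read from a file instead, the same loop should work with a Scanner constructed from a java.io.File.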
