
Tokenizer

 
John Reacher
Greenhorn
Posts: 18
Hi everyone,

I am trying to use StringTokenizer to read each line from a file and split it into tokens, where the elements are separated by ";". For example:

1000;The Times;40.58;hello

Would be 4 tokens. The problem I'm having is that the file contains lines with 5 tokens instead of 4. Once I'm done tokenizing, I'm sending these tokens as attributes to a constructor to create an object. The elements in the file that I need to tokenize represent 3 different types of object, all subclasses of one superclass.

Is there any way to distinguish the two types of object that have 5 tokens from the third type of object, which has 4? And can I use one tokenizer and arraylist to store the created objects? Or will it have to be 3?

I'm new to Java, please keep in mind. Thank you!
 
Winston Gutkowski
Bartender
Posts: 9497
Jack Reacher wrote: Is there any way to distinguish the two types of object that have 5 tokens from the third type of object, which has 4?

Sure. Have a look at the countTokens() method.

And can I use one tokenizer and arraylist to store the created objects? Or will it have to be 3?

I'm not quite sure what you're asking here: 3 tokenizers or 3 Lists?

You should only need 1 of each, provided the List is defined as List<SuperClass>; but whether it's what you want is another question entirely.

It may also be worth mentioning that StringTokenizer is a legacy class, and its use is not generally recommended.
This, from the API docs:
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

Winston
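To make the two suggestions above concrete, here is a minimal sketch that counts the fields of the sample line from the first post, once with countTokens() and once with String.split(). The class and method names are made up for illustration and are not from the thread.

```java
import java.util.StringTokenizer;

public class TokenCountDemo {

    // Counts fields with the legacy StringTokenizer.
    static int countWithTokenizer(String line) {
        return new StringTokenizer(line, ";").countTokens();
    }

    // Counts fields with String.split, the alternative the API docs suggest.
    static int countWithSplit(String line) {
        return line.split(";").length;
    }

    public static void main(String[] args) {
        String line = "1000;The Times;40.58;hello";
        System.out.println(countWithTokenizer(line)); // prints 4
        System.out.println(countWithSplit(line));     // prints 4
    }
}
```

One difference worth knowing: StringTokenizer skips empty fields, so "1000;;40.58" gives 2 tokens, whereas split(";") returns 3 elements, the middle one being an empty string.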
 
John Reacher
I'm just struggling to see how I could make this into a single tokenizer, given two of the object types have 5 tokens.

I could say:

if (st.countTokens() == 4)
{
.....tokenize this way...
}
else
{
.....tokenize this way...
}

But if there are two different subclass object types, both with 5 tokens, how do I distinguish between those?
 
Winston Gutkowski
Jack Reacher wrote: But if there are two different subclass object types, both with 5 tokens, how do I distinguish between those?

Simply put: I have no idea. Is there something about the data itself that might distinguish them (eg, the first token is numeric, rather than alphabetic)?

If not, I suspect you won't be able to do what you want.

And again, if this is for school and you've been told to use StringTokenizer, you should do as they ask. If not, use something else (eg, String.split()).

Winston
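As an illustration of the kind of data-based check mentioned above (the first token being numeric rather than alphabetic), here is a small sketch. The rule itself is hypothetical, since at this point in the thread nothing is known about the real data, and the class and method names are invented.

```java
public class FirstTokenCheck {

    // Returns true if the token is non-empty and consists only of digits.
    // Hypothetical rule: a numeric first token would mark one record type.
    static boolean isNumeric(String token) {
        return !token.isEmpty() && token.chars().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        String[] fields = "1000;The Times;40.58;hello".split(";");
        System.out.println(isNumeric(fields[0])); // prints true
        System.out.println(isNumeric(fields[1])); // prints false
    }
}
```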
 
John Reacher
Thanks for the advice so far. Each subclass object has a unique ID of 4 numbers, and the final digit determines which type of object it is. I tried this way, but the console just prints blankly and I can't see why:

public void loadPlayLists()
{
    try
    {
        File file = new File (PLAYLIST_FILE_NAME);
        Scanner in = new Scanner (file);

        while (in.hasNext())
        {
            String line = in.nextLine();
            StringTokenizer st = new StringTokenizer (line,";");

            String id = st.nextToken();
            char c = id.charAt(3);
            if (c == 0 || c == 1 || c == 2)
            {
                String category = st.nextToken();
                String playTime = st.nextToken();
                String audioFile = st.nextToken();
                String showTitle = st.nextToken();
                String hostName = st.nextToken();

                TalkShow tk = new TalkShow (id, category, playTime, audioFile, showTitle,
                                            hostName);
                playList.add(tk);
            }

            if (c == 3 || c == 4 || c == 5 || c == 6 || c == 7)
            {
                String category = st.nextToken();
                String playTime = st.nextToken();
                String audioFile = st.nextToken();
                String songName = st.nextToken();
                String artistGroup = st.nextToken();
                Song s = new Song (id, category, playTime, audioFile, songName,
                                   artistGroup);
                playList.add(s);
            }

            if (c == 8 || c == 9)
            {
                String category = st.nextToken();
                String playTime = st.nextToken();
                String audioFile = st.nextToken();
                String companyName = st.nextToken();
                Commercial cm = new Commercial (id, category, playTime, audioFile,
                                                companyName);
                playList.add(cm);
            }
        }
        in.close();
    }
    catch (FileNotFoundException fnfe)
    {
        fnfe.printStackTrace();
    }
}

I'm running a Main method, which creates the Manager object and then runs the method to print the arraylist items. Any insight you could provide would be most helpful. Thanks!
 
John Reacher
Never mind, I figured this out and it's working now. Thanks for the earlier advice though!
 
Campbell Ritchie
Sheriff
Posts: 47300
Jack Reacher wrote: . . . Each subclass object has a unique ID of 4 numbers, and the final digit determines which type of object it is. . . .
So, is this some sort of exercise in how to deal with somebody else’s useless data structures? That sounds very “brittle”; if you change the structure of that number slightly, you will end up putting all the remainder of the line in the wrong type. If you are supposed to write a report about your assignment (as well as the code) I would suggest you mention that problem. And take note of both bits of advice you have been given about tokenisers: you have to do what the assignment says, but you should also query the use of very old-fashioned code.

And well done sorting out your problem; please tell us how you did it.
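The thread never records what the fix was. One likely cause is visible in the posted code, though: c is a char, and a test such as c == 0 compares it with the integer 0 (the NUL character), not with the digit character '0' (which has the value 48). None of the three if blocks would ever run, so nothing is added to the list and nothing is printed. Below is a minimal sketch of the dispatch using char literals, with all three types stored in one list of the superclass type. The classes are cut-down, hypothetical stand-ins (the real superclass is never named in the thread, and the real constructors take more arguments), and the sample lines are invented.

```java
import java.util.ArrayList;
import java.util.List;

public class PlaylistParser {

    // Hypothetical stand-ins for the thread's classes, reduced to an id.
    static abstract class Item {
        final String id;
        Item(String id) { this.id = id; }
    }
    static class TalkShow extends Item { TalkShow(String id) { super(id); } }
    static class Song extends Item { Song(String id) { super(id); } }
    static class Commercial extends Item { Commercial(String id) { super(id); } }

    // Picks the subclass from the last digit of the id, as described in the thread.
    static Item parse(String line) {
        String[] fields = line.split(";");
        String id = fields[0];
        char c = id.charAt(id.length() - 1); // index 3 for a 4-digit id

        // Compare against char literals ('0'), not ints (0).
        if (c >= '0' && c <= '2') return new TalkShow(id);
        if (c >= '3' && c <= '7') return new Song(id);
        if (c == '8' || c == '9') return new Commercial(id);
        throw new IllegalArgumentException("Unrecognised id: " + id);
    }

    public static void main(String[] args) {
        // One list typed to the superclass holds all three kinds of object.
        List<Item> playList = new ArrayList<>();
        playList.add(parse("1001;talk;30:00;show.mp3;Morning Chat;A. Host"));
        playList.add(parse("1004;music;03:30;song.mp3;Some Song;Some Band"));
        playList.add(parse("1009;ad;00:30;ad.mp3;Some Company"));

        for (Item item : playList) {
            System.out.println(item.getClass().getSimpleName() + " " + item.id);
        }
    }
}
```

Throwing on an unrecognised id, rather than silently skipping the line, also makes the brittleness mentioned above show up as an error instead of a blank console.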
 