
delimiters and regex

 
ilteris kaplan
Ranch Hand
Posts: 38
Hello, I am trying to build an algorithm that converts every word from an incoming file to a String and compares its characters with some phrase. I managed to get the words separately by using the StringTokenizer class, since it lets me pass delimiters=".,':;?{}[]=-+_!@#$%^&*() " as an argument, but the words come back as String references to the same object, and I am trying to keep them in an array. So I need to do something with String.split(".,':;?{}[]=-+_!@#$%^&*() "), which is not working.

Here is my code
 
Ulf Dittmer
Rancher
Posts: 42968
73
Hello "ilteris"-

Welcome to JavaRanch.

On your way in you may have missed that there is a policy on screen names here, and that yours does not conform to it. Please read it, and adjust it accordingly, which you can do right here. Thanks for your prompt attention to this matter.

Enjoy your time here.
 
Ulf Dittmer
Rancher
Posts: 42968
73
I'm not quite sure what you're trying to do, but it looks like instead of

String[] word = content.split(".,':;?{}[]=-+_!@#$%^&*() ");

you should use

String[] word = tokenizer.nextToken().split(".,':;?{}[]=-+_!@#$%^&*() ");
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Mmmmm... unfortunately you've chosen a problem that is difficult to do with split(), but should be very easy with a StringTokenizer. The simple answer here is: forget about split(), as that requires you to understand regular expressions, which are a special type of string with their own rules, described in detail at the link I just gave. Just use the methods described in StringTokenizer, like nextToken().
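
For what it's worth, here is a minimal sketch of the StringTokenizer-only approach, assuming the goal is simply to end up with the words in an array. The class name TokenizerSketch and the method name words are made up for illustration; the delimiter string is the one from the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, not taken from the original poster's code.
class TokenizerSketch {
    // Delimiter characters copied from the original question.
    static final String DELIMITERS = ".,':;?{}[]=-+_!@#$%^&*() ";

    // Collects every token into a list, then copies the list into an array.
    static String[] words(String content) {
        StringTokenizer tokenizer = new StringTokenizer(content, DELIMITERS);
        List<String> words = new ArrayList<String>();
        while (tokenizer.hasMoreTokens()) {
            // Each call to nextToken() returns the next word as its own String.
            words.add(tokenizer.nextToken());
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String word : words("Hello, world! It's (almost) done.")) {
            System.out.println(word);
        }
    }
}
```

StringTokenizer treats each character of the delimiter string literally, so none of the punctuation needs escaping here.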

[ January 21, 2006: Message edited by: Jim Yingst ]
 
Garrett Rowe
Ranch Hand
Posts: 1296
I don't know, this seems like an occasion where regular expressions should be pretty straightforward. Like maybe,

instead of using the StringTokenizer class at all.

From StringTokenizer documentation:
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
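
A minimal sketch of what a split()-based version could look like, not necessarily what was originally posted here. It assumes the delimiters from the original question are wrapped in a regex character class, with the brackets and the hyphen escaped; the class name SplitSketch and the method name words are made up for illustration.

```java
// Hypothetical helper class, shown only to illustrate the character-class idea.
class SplitSketch {
    // A character class: any one of the listed characters counts as a delimiter.
    // '[' , ']' and '-' are escaped because they have special meaning inside a class;
    // the trailing '+' treats a run of delimiters as a single separator.
    static final String DELIMITER_REGEX = "[.,':;?{}\\[\\]=\\-+_!@#$%^&*() ]+";

    static String[] words(String content) {
        return content.split(DELIMITER_REGEX);
    }

    public static void main(String[] args) {
        for (String word : words("Hello, world! It's (almost) done.")) {
            System.out.println(word);
        }
    }
}
```

One difference from StringTokenizer to watch for: if the text starts with a delimiter, split() returns an empty string as the first element, so that case may need to be filtered out.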

[ January 21, 2006: Message edited by: Garrett Rowe ]
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, there are regexes here that are straightforward, if you know how regexes work. However, it's also possible to go very badly awry. To understand all the things wrong with the initial attempt & how to fix them takes a bit of explanation, which I decided to avoid. The StringTokenizer was pretty simple too, until split() was brought into the picture rather than simply calling nextToken(). I like split() a lot - but when used by people who don't know regular expressions, it can be a minefield. Especially when there are a lot of special characters being mentioned.
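
To make the minefield a bit more concrete, here is a rough sketch, under the assumption that a standard JDK is being used. As far as I can tell, the delimiter string from the original question is not even a valid regex there (the "{}" is read as a malformed repetition), so split() fails before it splits anything. The helper names compiles and asCharacterClass are made up for illustration; the second one backslash-escapes each delimiter and wraps the result in a character class.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, shown only to illustrate the pitfall.
class SplitPitfallSketch {
    // Delimiter characters copied from the original question.
    static final String DELIMITERS = ".,':;?{}[]=-+_!@#$%^&*() ";

    // Returns true if the given string is accepted as a regular expression.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Turns a plain list of delimiter characters into a regex character class.
    // A backslash before a non-alphanumeric character always yields that literal character.
    static String asCharacterClass(String delimiters) {
        StringBuilder regex = new StringBuilder("[");
        for (char c : delimiters.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                regex.append('\\');
            }
            regex.append(c);
        }
        return regex.append("]+").toString();
    }

    public static void main(String[] args) {
        System.out.println("raw string compiles: " + compiles(DELIMITERS));
        System.out.println("character class compiles: " + compiles(asCharacterClass(DELIMITERS)));
    }
}
```

Escaping every character is heavier than strictly needed, but it avoids having to remember which ones are special inside a character class.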
 
ilteris kaplan
Ranch Hand
Posts: 38
gotcha! It looks like I need to train myself with regex first!
Thanks for all the feedback guys, this board is great!

best
ilteris.
 