delimiters and regex

ilteris kaplan
Ranch Hand

Joined: Jan 21, 2006
Posts: 38
Hello, I am trying to build an algorithm that converts every word from an incoming file to a String and compares its characters with some phrase. I managed to get the words separately by using the StringTokenizer class, since it allows me to pass delimiters=".,':;?{}[]=-+_!@#$%^&*() " as an argument, but they come back as String references of the same object, and I am trying to keep them in an array. So I need to do something with String.split(".,':;?{}[]=-+_!@#$%^&*() "), which is not working.

Here is my code
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39551
    
Hello "ilteris"-

Welcome to JavaRanch.

On your way in you may have missed that there is a policy on screen names here, and that yours does not conform to it. Please read it, and adjust it accordingly, which you can do right here. Thanks for your prompt attention to this matter.

Enjoy your time here.


Ping & DNS - updated with new look and Ping home screen widget
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39551
    
I'm not quite sure what you're trying to do, but it looks like instead of

String[] word = content.split(".,':;?{}[]=-+_!@#$%^&*() ");

you should use

String[] word = tokenizer.nextToken().split(".,':;?{}[]=-+_!@#$%^&*() ");
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Mmmmm... unfortunately you've chosen a problem that is difficult to do with split, but should be very easy with a StringTokenizer. The simple answer here is: forget about split(), as that requires you to understand regular expressions, which are a special type of string with their own rules, described in detail at the link I just gave. Just use the methods described in StringTokenizer, like nextToken().
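
For illustration, here is a minimal sketch of the StringTokenizer-only approach, assuming the goal is simply to end up with the words in a String array. The class name TokenizerWords, the method words() and the sample sentence are made up for this example; only the delimiter string comes from the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class; names and sample text are illustrative only.
class TokenizerWords {
    // Same delimiter set as in the original question.
    static final String DELIMITERS = ".,':;?{}[]=-+_!@#$%^&*() ";

    // Collects every token into a String[] so the words can be kept and compared later.
    static String[] words(String content) {
        StringTokenizer tokenizer = new StringTokenizer(content, DELIMITERS);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String word : words("Hello, world! Is this (really) a test?")) {
            System.out.println(word);
        }
    }
}
```

StringTokenizer treats each character of the delimiter string as a separate delimiter and skips runs of them, so no regular expression knowledge is needed here.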

[ January 21, 2006: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Garrett Rowe
Ranch Hand

Joined: Jan 17, 2006
Posts: 1296
I don't know, this seems like an occasion where regular expressions should be pretty straightforward. Like maybe,

instead of using the StringTokenizer class at all.

From StringTokenizer documentation:
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
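
As a rough sketch of what a split()-based version could look like (this is an illustration, not the snippet originally posted in this reply), the delimiters can be written as a regex character class. The class name SplitWords and the sample sentence are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical class name; the regex below is an illustration, not code from this thread.
class SplitWords {
    // One or more delimiter characters in a row, written as a character class.
    // '[', ']' and '-' are escaped because they have a special meaning inside [...].
    static final String DELIMITER_REGEX = "[.,':;?{}\\[\\]=\\-+_!@#$%^&*() ]+";

    static String[] words(String content) {
        return content.split(DELIMITER_REGEX);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("Hello, world! Is this (really) a test?")));
    }
}
```

One difference from StringTokenizer to keep in mind: if the text starts with a delimiter, split() returns an empty String as the first array element.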

[ January 21, 2006: Message edited by: Garrett Rowe ]

Some problems are so complex that you have to be highly intelligent and well informed just to be undecided about them. - Laurence J. Peter
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Well, there are regexes here that are straightforward, if you know how regexes work. However, it's also possible to go very badly awry. To understand all the things wrong with the initial attempt & how to fix them takes a bit of explanation, which I decided to avoid. The StringTokenizer was pretty simple too, until split() was brought into the picture rather than simply calling nextToken(). I like split() a lot - but when used by people who don't know regular expressions, it can be a minefield. Especially when there are a lot of special characters being mentioned.
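
To make the "minefield" a bit more concrete, here is a small sketch, under the assumption that the delimiters are all punctuation or whitespace. Passed to split() unchanged, the delimiter string is read as a regular expression, where characters such as . ? { } [ ] + * ( ) ^ $ have special meanings, and it does not even compile. One way around that is to escape each character and wrap the result in a character class. The class and method names (DelimiterRegex, compiles, toCharacterClass) are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class; names are illustrative only.
class DelimiterRegex {
    // The raw delimiter string from the original question.
    static final String DELIMITERS = ".,':;?{}[]=-+_!@#$%^&*() ";

    // Returns true if the given string is a syntactically valid regular expression.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Builds "[...]+" from a plain list of delimiter characters.
    // Every character that is not a letter or digit gets a backslash in front of it,
    // which Java's regex syntax allows for any non-alphabetic character.
    static String toCharacterClass(String delimiters) {
        StringBuilder regex = new StringBuilder("[");
        for (char c : delimiters.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                regex.append('\\');
            }
            regex.append(c);
        }
        return regex.append("]+").toString();
    }

    public static void main(String[] args) {
        System.out.println("raw string compiles: " + compiles(DELIMITERS));
        String regex = toCharacterClass(DELIMITERS);
        for (String word : "To be, or not to be?".split(regex)) {
            System.out.println(word);
        }
    }
}
```

Generating the character class from the plain delimiter list keeps that list readable and avoids having to remember which characters need escaping by hand.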
ilteris kaplan
Ranch Hand

Joined: Jan 21, 2006
Posts: 38
gotcha! It looks like I need to train myself with regex first!
Thanks for all the feedback guys, this board is great!

best
ilteris.
 