
word / sentence regex pattern ?

jay vas
Ranch Hand

Joined: Aug 30, 2005
Posts: 407
Hi guys: I am trying to parse words and sentences in a tokenizer.

I'm using a hand-coded system:

Any suggestions for a more comprehensive regular expression?
I assume this problem has been solved before.

Amit ChaudhariC
Ranch Hand

Joined: Aug 06, 2009
Posts: 33
You can try out something like:

Campbell Ritchie

Joined: Oct 13, 2005
Posts: 36486
Two of those characters are metacharacters, but they appear to work in this context.
Rob Spoor

Joined: Oct 27, 2005
Posts: 19543

Most metacharacters lose their meaning inside character classes. Other metacharacters change meaning (^ is start of input outside, but negates the class when it is the first character inside), and others are introduced (- is a plain literal outside; inside it denotes a range unless it is the first or last character).
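
To make that concrete, here is a small sketch. The pattern posted earlier in this thread is not reproduced here, so the three patterns below are only illustrative assumptions, as are the class name `CharClassDemo` and the helper `matches`.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Inside a character class, '.' and '?' are literals, so this matches
    // exactly one sentence-ending mark and nothing else.
    static final Pattern SENTENCE_END = Pattern.compile("[.?!]");

    // '^' as the first character inside a class negates it:
    // any single character that is NOT a lowercase ASCII letter.
    static final Pattern NOT_LOWER = Pattern.compile("[^a-z]");

    // '-' as the first character inside a class is a literal hyphen;
    // the later '-' between 'a' and 'z' still denotes a range.
    static final Pattern HYPHEN_OR_LOWER = Pattern.compile("[-a-z]");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(SENTENCE_END, "."));    // true
        System.out.println(matches(SENTENCE_END, "x"));    // false, '.' is no wildcard here
        System.out.println(matches(NOT_LOWER, "A"));       // true
        System.out.println(matches(NOT_LOWER, "a"));       // false
        System.out.println(matches(HYPHEN_OR_LOWER, "-")); // true
        System.out.println(matches(HYPHEN_OR_LOWER, "B")); // false
    }
}
```

If in doubt, escaping a character inside the class (for example `[\\-a-z]` in a Java string literal) is harmless and saves having to remember the rule.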

Campbell Ritchie

Joined: Oct 13, 2005
Posts: 36486
. . . but I can never remember which is which.

You do realise you can use methods of the Character class like isWhitespace, jay vas?
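
For what it's worth, a hand-coded word splitter built on those `Character` methods might look roughly like the sketch below. It is not the original poster's code (which isn't shown in the thread); the class name `CharTokenizer`, the method `words`, and the choice to drop punctuation outright are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CharTokenizer {
    // Hypothetical helper: splits text into words using Character methods only.
    // Whitespace ends the current word, letters and digits are kept, and any
    // other character (punctuation) is dropped, so "It's" becomes "Its".
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // End of a word; ignore runs of consecutive whitespace.
                if (current.length() > 0) {
                    out.add(current.toString());
                    current.setLength(0);
                }
            } else if (Character.isLetterOrDigit(c)) {
                current.append(c);
            }
            // Anything else is punctuation and is skipped.
        }
        // Flush the last word if the text did not end in whitespace.
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world.  It's 9am"));
        // prints [Hello, world, Its, 9am]
    }
}
```

This only covers words. Sentence boundaries would need extra handling of '.', '?' and '!' on top of it.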