JavaRanch » Java Forums » Java » Java in General

word / sentence regex pattern ?

jay vas
Ranch Hand

Joined: Aug 30, 2005
Posts: 407
Hi guys: I am trying to parse words and sentences in a tokenizer.

I'm using a hand-coded system:



Any suggestions for a more comprehensive regular expression?
I assume this problem has been solved before.


Amit ChaudhariC
Ranch Hand

Joined: Aug 06, 2009
Posts: 33
You can try something like:


Regards,
Amit
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39103
Two of those characters are metacharacters, but they appear to work in this context.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19697

Most metacharacters lose their meaning inside character classes. Others change meaning (^ means start of input outside a class, but negates the class when it is the first character inside), and some are introduced (- means nothing special outside, but inside it denotes a range unless it is the first or last character).
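
To illustrate those rules, here is a minimal sketch using java.util.regex. The class name CharClassDemo and the patterns are made up for this example and are not the patterns posted earlier in the thread.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Inside [...] the dot and the question mark are literals, so no escaping is needed.
    static final Pattern SENTENCE_END = Pattern.compile("[.?!]");

    // A leading ^ negates the class: any single character that is NOT a, b or c.
    static final Pattern NOT_ABC = Pattern.compile("[^abc]");

    // A hyphen between two characters forms a range: a through c.
    static final Pattern RANGE = Pattern.compile("[a-c]");

    // A hyphen placed first is a literal hyphen: matches '-', 'a' or 'c' only.
    static final Pattern LITERAL_HYPHEN = Pattern.compile("[-ac]");

    // True if the whole string matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(SENTENCE_END, "."));   // true: literal dot
        System.out.println(matches(SENTENCE_END, "x"));   // false: dot is not a wildcard here
        System.out.println(matches(NOT_ABC, "d"));        // true
        System.out.println(matches(RANGE, "b"));          // true: b lies in a-c
        System.out.println(matches(LITERAL_HYPHEN, "b")); // false: no range was formed
        System.out.println(matches(LITERAL_HYPHEN, "-")); // true
    }
}
```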


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39103
. . . but I can never remember which is which.

You do realise you can use methods of the Character class like isWhitespace, jay vas?
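
As a rough sketch of that suggestion, a hand-coded word splitter can lean on Character.isWhitespace instead of a regex. The class WhitespaceTokenizer and its words method are hypothetical names for illustration; this is not the original poster's tokenizer, and it leaves punctuation attached to the words.

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceTokenizer {
    // Splits text into runs of non-whitespace characters.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current word, if one is in progress.
                if (current.length() > 0) {
                    result.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the final word when the text does not end in whitespace.
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [Hi, guys., How, are, you?]
        System.out.println(words("Hi guys.\tHow are\n you?"));
    }
}
```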
 