| Author |
Regex Help
|
Joel Christophel
Ranch Hand
Joined: Apr 20, 2011
Posts: 119
|
|
Right now the following code removes everything that's not a letter from each element of the array. How would I alter the regex so that it allows letters, apostrophes ( ’ ) and hyphens ( - )?
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16695
|
|
Joel Christophel wrote: Right now the following code removes everything that's not a letter from each element of the array. How would I alter the regex so that it allows letters, apostrophes ( ’ ) and hyphens ( - )?
Looks straightforward, and an incredibly simple regex -- What problem are you having? And what have you tried?
Henry
|
Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
|
 |
Winston Gutkowski
Bartender
Joined: Mar 17, 2011
Posts: 4756
|
|
Joel Christophel wrote: Right now the following code removes everything that's not a letter from each element of the array. How would I alter the regex so that it allows letters, apostrophes ( ’ ) and hyphens ( - )?
Seems an odd request; your regex is specifically there to remove everything except letters. I think it would be better to back up and explain precisely what you want it to do.
It might also be worth mentioning that the above regex will only work for English text. If you need it to work generically, you'll probably need to either use Unicode-aware regex character classes (such as \p{L}) or write a method of your own (my slight preference; regexes are great, but not for everything).
If you decide on the latter, you might want to check out the "category" (is...) methods in java.lang.Character, such as Character.isLetter().
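As a rough illustration of the hand-written alternative, here is a minimal sketch built on Character.isLetter(). The class name LetterFilter and the method keepWordChars are hypothetical, and keeping apostrophes and hyphens is an assumption taken from the original question rather than anything posted in this reply.

```java
class LetterFilter {
    // Hypothetical helper: keeps letters (in any script), apostrophes and hyphens,
    // and drops every other character.
    static String keepWordChars(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        // Iterate by code point so supplementary characters are tested whole.
        text.codePoints()
            .filter(cp -> Character.isLetter(cp) || cp == '\'' || cp == '-')
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepWordChars("dog!"));        // prints dog
        System.out.println(keepWordChars("caf\u00e9, ")); // the accented letter is kept
    }
}
```

Because Character.isLetter() follows the Unicode letter categories, this version is not limited to A-Z and a-z, which is the generality the regex in the question lacks.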
HIH
Winston
|
Isn't it funny how there's always time and money enough to do it WRONG?
|
 |
Joel Christophel
Ranch Hand
Joined: Apr 20, 2011
Posts: 119
|
|
Winston Gutkowski wrote: I think it would be better to back up and explain precisely what you want it to do.
Each array element holds text mixed with punctuation (and, or dog! or tree." for instance). I'd like it to remove everything that's not a letter, except hyphens and apostrophes (which are integral parts of some words).
Winston Gutkowski wrote: It might also be worth mentioning that the above regex will only work for English text.
That's the intention.
|
 |
Winston Gutkowski
Bartender
Joined: Mar 17, 2011
Posts: 4756
|
|
Joel Christophel wrote: I'd like it to remove everything that's not a letter, except hyphens and apostrophes (which are integral parts of some words).
In which case, add them inside the square brackets, e.g.:
"[^A-Za-z'-]"
Just make sure that the '-' is the last character inside the brackets (otherwise it will be interpreted as part of a range), or escape it.
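Since the original code was not posted, the following is only a sketch of how that character class might be applied with String.replaceAll(). The class name WordCleaner, the method clean and the sample array are all hypothetical. It also assumes the straight ASCII apostrophe; a typographic one ( ’ ) would have to be added to the brackets separately.

```java
import java.util.Arrays;

class WordCleaner {
    // Negated character class: matches anything that is NOT an ASCII letter,
    // an apostrophe or a hyphen. The hyphen comes last, so it is taken literally.
    static final String NOT_WORD_CHAR = "[^A-Za-z'-]";

    // Hypothetical helper: returns a new array with the unwanted characters removed.
    static String[] clean(String[] words) {
        String[] cleaned = new String[words.length];
        for (int i = 0; i < words.length; i++) {
            cleaned[i] = words[i].replaceAll(NOT_WORD_CHAR, "");
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String[] words = {"and,", "dog!", "tree.\"", "well-known", "don't"};
        System.out.println(Arrays.toString(clean(words)));
        // prints [and, dog, tree, well-known, don't]
    }
}
```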
Winston
|
 |
Joel Christophel
Ranch Hand
Joined: Apr 20, 2011
Posts: 119
|
|
|
Thanks!
|
 |
Winston Gutkowski
Bartender
Joined: Mar 17, 2011
Posts: 4756
|
|
Joel Christophel wrote: Thanks!
You're welcome.
Winston
|
 |