
Regex Help

 
Joel Christophel
Right now the following code removes everything that's not a letter from each index of the array. How would I alter the regex so that it allows letters and apostrophes ( ’ ) and hyphens ( - )?
 
Henry Wong
Joel Christophel wrote: Right now the following code removes everything that's not a letter from each index of the array. How would I alter the regex so that it allows letters and apostrophes ( ’ ) and hyphens ( - )?


This looks straightforward, and the regex is very simple. What problem are you having, and what have you tried?

Henry
 
Winston Gutkowski
Joel Christophel wrote: Right now the following code removes everything that's not a letter from each index of the array. How would I alter the regex so that it allows letters and apostrophes ( ’ ) and hyphens ( - )?

Seems an odd request; your regex is specifically there to remove everything except letters. I think it would be better to back up and explain precisely what you want it to do.

It might also be worth mentioning that the above regex will only work for English text. If you need it to work for other languages, you'll probably need to use either Unicode-aware regex character classes (such as \p{L}) or write a method of your own (my slight preference; regexes are great, but not for everything).

If you decide on the latter you might want to check out the "category" (is...) methods in java.lang.Character.
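
A minimal sketch of both options, assuming Java 8 or later. The class and method names here are hypothetical, and keeping ' and - alongside letters is an assumption taken from the question rather than part of either technique:

```java
final class WordChars {
    // Hypothetical helper: keeps letters from any script plus ' and -,
    // using the category test Character.isLetter.
    static String keepWithCharacter(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
         .filter(cp -> Character.isLetter(cp) || cp == '\'' || cp == '-')
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // The same filter written as a regex: \p{L} matches any Unicode letter.
    static String keepWithRegex(String s) {
        return s.replaceAll("[^\\p{L}'-]", "");
    }

    public static void main(String[] args) {
        String word = "caf\u00e9!";  // accented e written as an escape
        System.out.println(keepWithCharacter(word));
        System.out.println(keepWithRegex(word));
    }
}
```

Both methods should keep the accented letter and drop the exclamation mark, whereas an ASCII-only class such as [^A-Za-z] would remove the accented letter as well.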

HIH

Winston
 
Joel Christophel
Winston Gutkowski wrote: I think it would be better to back up and explain precisely what you want it to do.


Each array element holds a word mixed with punctuation, for example and, or dog! or tree." and so on. I'd like it to remove everything that's not a letter, except hyphens and apostrophes (which are integral parts of some words).

Winston Gutkowski wrote: It might also be worth mentioning that the above regex will only work for English text.

That's the intention.
 
Winston Gutkowski
Joel Christophel wrote: I'd like it to remove everything that's not a letter, except hyphens and apostrophes (which are integral parts of some words).

In which case, add them inside the square brackets, e.g.:
"[^A-Za-z']"
Just make sure that any '-' is the last character inside the brackets (or is escaped); otherwise it will be interpreted as part of a range.
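
As a rough sketch of that advice applied to an array, assuming the original code called replaceAll on each element (the original snippet is not shown here, so the class, method, and sample words below are hypothetical):

```java
import java.util.Arrays;

final class StripPunctuation {
    // Removes every character that is not an ASCII letter, an apostrophe,
    // or a hyphen. The '-' sits last in the class so it is taken literally.
    static String[] clean(String[] words) {
        String[] out = new String[words.length];
        for (int i = 0; i < words.length; i++) {
            out[i] = words[i].replaceAll("[^A-Za-z'-]", "");
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"and,", "dog!", "tree.\"", "don't", "well-known;"};
        System.out.println(Arrays.toString(clean(words)));
        // prints [and, dog, tree, don't, well-known]
    }
}
```

Note that this class only covers the straight apostrophe ('). If the text uses the typographic apostrophe (’), that character would need to be added inside the brackets as well.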

Winston
 
Joel Christophel
Thanks!
 
Winston Gutkowski
Joel Christophel wrote: Thanks!

You're welcome.

Winston
 