
Regular expression

Carlos Bonzilla

Joined: May 03, 2011
Posts: 17
I have a field where the users are allowed to enter comments on my web-page. The characters allowed to enter are . For instance:
1. "I am allowed" is ok
2. "I am not allowed #" is not ok
3. "ÅÄÖåäö-,.:'éüáèç%()@" is ok.

Any suggestion for the regular expression that will solve this ?
Best regards
Darryl Burke

Joined: May 03, 2008
Posts: 4523

Here are a couple of learning resources for regex:

And of course there's the java.util.regex.Pattern API.

Show your best efforts, in the form of an SSCCE and someone will help you do the fine-tuning if needed.

luck, db
Ryan Beckett
Ranch Hand

Joined: Feb 22, 2009
Posts: 192

Since I've just given you the answer, at least let me explain it, so you can learn how I did it.

Start off by reviewing the literature in the Regex API linked above. It's a good reference, but if you've never done regular expressions, check out the tutorials first.


This is the range of the specific Latin Unicode characters you expect to be in the input. Simple enough. See the Latin Unicode chart for details.


This means match (or allow) any word character (0-9, A-Z, a-z, or underscore).


Allow whitespace characters.


Allow any punctuation character.


Allow all of the previously declared characters "and not" this one. Whatever punctuation you don't want needs to be included inside the brackets.


This is a greedy quantifier. It says to allow "one or more of any of these characters" in the string. Note that the set of characters must be enclosed in square brackets so that the quantifier applies to the whole set.

Also note that the backslashes in these specifiers are doubled because the pattern is written inside a Java string literal. Hope that helps. Good luck.
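
The pattern itself is not shown in this post, so the following is only a sketch assembled from the pieces described above. It assumes `#` is the one excluded character; the class name `CommentCheck` and the method `isAllowed` are made up for illustration.

```java
import java.util.regex.Pattern;

class CommentCheck {
    // Assumed reconstruction: Latin-1 accented range, word characters,
    // whitespace and punctuation, intersected (&&) with "anything except #".
    static final Pattern ALLOWED =
            Pattern.compile("[\\u00C0-\\u00FF\\w\\s\\p{Punct}&&[^#]]+");

    // matches() only succeeds if the whole input fits the pattern.
    static boolean isAllowed(String comment) {
        return ALLOWED.matcher(comment).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("I am allowed"));        // true
        System.out.println(isAllowed("I am not allowed #"));  // false
    }
}
```

Because `\\p{Punct}` covers all ASCII punctuation, anything not listed after `&&[^` is still accepted, including characters such as `$` or `*`.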
Carlos Bonzilla

Joined: May 03, 2011
Posts: 17
Ryan Beckett wrote:

See Latin Unicode.

Thanks for your help, Ryan. I think some more characters need to be excluded. For instance, the string "Hey how are u$[*?" passed the test although it shouldn't.

Best regards
Ryan Beckett
Ranch Hand

Joined: Feb 22, 2009
Posts: 192
Try this.
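
The pattern posted with this reply did not survive in the thread. As a guess at the kind of change involved, more characters can be listed in the negated class. The sketch below assumes `#`, `$`, `[`, `*` and `?` are the unwanted ones; the class name `StricterCommentCheck` is hypothetical.

```java
import java.util.regex.Pattern;

class StricterCommentCheck {
    // Latin-1 accented range, word characters, whitespace and punctuation,
    // minus # $ [ * ? (an assumed exclusion list).
    // The '[' is escaped because an unescaped '[' opens a nested class.
    static final Pattern ALLOWED =
            Pattern.compile("[\\u00C0-\\u00FF\\w\\s\\p{Punct}&&[^#$\\[*?]]+");

    static boolean isAllowed(String comment) {
        return ALLOWED.matcher(comment).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("Hey how are u"));      // true
        System.out.println(isAllowed("Hey how are u$[*?"));  // false
    }
}
```

If the list of unwanted punctuation keeps growing, it may be simpler to drop `\\p{Punct}` and list only the punctuation that is allowed.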

Carlos Bonzilla

Joined: May 03, 2011
Posts: 17
Ryan Beckett wrote:Try this.

Thanks for your explanation Ryan. I am very new to regular expressions so your links will be read for sure

Best regards
Rob Spoor

Joined: Oct 27, 2005
Posts: 19651

I'd probably use \\p{L} and \\d instead of \u00C0-\u00FF and \\w; \\p{L} includes a-z and A-Z, so \\w can be replaced by \\d (the underscore that \\w also matches is already covered by \\p{Punct}). \\p{L} matches all Unicode letters, including some of the more exotic ones (Spanish, Scandinavian, etc.).
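
A minimal sketch of that suggestion, assuming the rest of the pattern (whitespace, punctuation, `#` excluded) stays as described earlier in the thread; the class name `UnicodeCommentCheck` is invented for the example.

```java
import java.util.regex.Pattern;

class UnicodeCommentCheck {
    // \p{L} = any Unicode letter, \d = ASCII digit; '#' is still excluded.
    static final Pattern ALLOWED =
            Pattern.compile("[\\p{L}\\d\\s\\p{Punct}&&[^#]]+");

    static boolean isAllowed(String comment) {
        return ALLOWED.matcher(comment).matches();
    }

    public static void main(String[] args) {
        // Polish letters written as escapes; they lie outside the Latin-1 range.
        System.out.println(isAllowed("Za\u017C\u00F3\u0142\u0107 123"));  // true
        System.out.println(isAllowed("I am not allowed #"));              // false
    }
}
```

One side effect worth knowing: the Latin-1 range also contains the multiplication and division signs (U+00D7 and U+00F7), which are not letters, so this version rejects them.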
