Regular Expression Question

Raymond Van Eperen
Greenhorn

Joined: Oct 07, 2008
Posts: 17

Can someone tell me what I need to change in the following regular expression so that the first digit of the month and day is optional but, if present, must be one of the specified values?

In the code below, the first and second invalid values ("21/16/1908" and "11/42/1908") pass the test, but shouldn't. I know there are plenty of date-related regular expressions out there I could borrow, but I want to learn by understanding what is wrong with my expression.

The expression is for dates in mm/dd/yyyy format where the first digit of the day and month is optional and the first two digits of the year are optional.

Peter Lawrey
Ranch Hand

Joined: Dec 21, 2008
Posts: 62
Your pattern is more complicated than it needs to be, which is perhaps what is confusing you.
Try:
"(0[1-9]|1[0-2])[/-_\\.](0[1-9]|[1-2][0-9]|3[0-1])[/-_\\.]((19|20)?[0-9][0-9])"
[ December 26, 2008: Message edited by: Peter Lawrey ]
Raymond Van Eperen
Greenhorn

Joined: Oct 07, 2008
Posts: 17

This doesn't work. It fails on "01-16-2008", so it probably doesn't like the dash as the delimiter.
Bauke Scholtz
Ranch Hand

Joined: Oct 08, 2006
Posts: 2458
A regex isn't the right tool for validating the actual date values. What about leap years and so on?

Just use java.util.Calendar and make it non-lenient.

E.g.
If an IllegalArgumentException is thrown, the date is invalid.
If you're working with Strings, you can also use SimpleDateFormat#setLenient() for that.
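
The code from this post did not survive, so the following is only a minimal sketch of the two approaches described, not the poster's original. The class name StrictDateCheck, the method names, and the "MM/dd/yyyy" pattern are assumptions made for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;

class StrictDateCheck {

    // Field-based check: a non-lenient Calendar throws IllegalArgumentException
    // when its time is computed (here by getTime()) from impossible field values.
    static boolean isValid(int year, int month, int day) {
        Calendar cal = new GregorianCalendar();
        cal.setLenient(false);
        cal.clear();
        cal.set(year, month - 1, day); // Calendar months are zero-based
        try {
            cal.getTime();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // String-based check: a non-lenient SimpleDateFormat throws ParseException
    // instead of silently rolling "11/42/1908" over into the next month.
    static boolean isValid(String text) {
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yyyy");
        format.setLenient(false);
        try {
            format.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(2007, 2, 29));
        System.out.println(isValid("11/42/1908"));
    }
}
```

Both calls in main should report the date as invalid. Unlike a regex, this also takes leap years and month lengths into account.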

Also see these utility class examples:
http://balusc.blogspot.com/2007/09/calendarutil.html
http://balusc.blogspot.com/2007/09/dateutil.html
[ December 26, 2008: Message edited by: Bauke Scholtz ]
Sunil Kumar
Ranch Hand

Joined: Apr 24, 2007
Posts: 76
I agree with Bauke.

As for the regex Peter provided: "-" (hyphen) carries a special meaning inside a Java regex character class, namely a range. So if you want to use the hyphen as a delimiter, you need to escape the - with a \ (backslash), as you would for any reserved character, just as was done for "." (dot).
So your regex will become
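
The expression posted here is missing from the thread. As a sketch of the point being made (not the poster's exact code), the snippet below compares the delimiter class from Peter's pattern with a version in which the hyphen is escaped. The class and method names are assumptions.

```java
import java.util.regex.Pattern;

class DelimiterRange {
    // As posted: "/-_" is read as the range from '/' (U+002F) to '_' (U+005F),
    // which covers digits and upper-case letters but not '-' (U+002D).
    static final Pattern UNESCAPED = Pattern.compile("[/-_\\.]");

    // Hyphen escaped (written "\\-" in a Java string literal): a literal '-'.
    static final Pattern ESCAPED = Pattern.compile("[/\\-_\\.]");

    static boolean isDelimiter(Pattern pattern, char c) {
        return pattern.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        for (char c : new char[] {'/', '-', '_', '.', '5', 'A'}) {
            System.out.println("'" + c + "' unescaped=" + isDelimiter(UNESCAPED, c)
                    + " escaped=" + isDelimiter(ESCAPED, c));
        }
    }
}
```

With the unescaped class, '5' and 'A' are accepted as delimiters while '-' is rejected, which is why "01-16-2008" failed.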


Sunil Kumar
http://goodtoknowit.blogspot.com/
Raymond Van Eperen
Greenhorn

Joined: Oct 07, 2008
Posts: 17

Thanks to each of you for your response. I also agree that there are better ways to validate a date than using a regex. I am only doing this as a refresher in regex for my own benefit, not to use it in production code. So, to continue the line of questioning, the regex now doesn't work with "1/16/1908". It must not consider the first digit of the month to be optional. I thought that was done by putting * after the grouping.
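
On the quantifier itself: both * and ? allow the preceding element to be absent, but * also allows it to repeat any number of times, so ? is the usual choice for "optional, at most once". The following is a small sketch using a hypothetical, simplified leading-digit pattern (not the expression from this thread) to show the difference.

```java
import java.util.regex.Pattern;

class OptionalDigit {
    // '?' allows the leading digit zero or one time.
    static final Pattern WITH_QUESTION_MARK = Pattern.compile("[01]?[0-9]");

    // '*' allows it zero or more times, so extra leading digits slip through.
    static final Pattern WITH_STAR = Pattern.compile("[01]*[0-9]");

    static boolean accepts(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"1", "01", "0112"}) {
            System.out.println(input + " ?=" + accepts(WITH_QUESTION_MARK, input)
                    + " *=" + accepts(WITH_STAR, input));
        }
    }
}
```

Both patterns accept "1" and "01", but only the * version accepts "0112".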
Bauke Scholtz
Ranch Hand

Joined: Oct 08, 2006
Posts: 2458
http://www.regular-expressions.info/dates.html

You will still have trouble getting the right number of days per month.
[ December 26, 2008: Message edited by: Bauke Scholtz ]
Raymond Van Eperen
Greenhorn

Joined: Oct 07, 2008
Posts: 17

I figured it out. The month needed to be: ([0-9]|0[1-9]|1[0-2]), and the day needed to be: ([0-9]|0[1-9]|[1-2][0-9]|3[0-1])

All the tests work now. I realize this doesn't really ensure a proper date; I was just using this example to brush up on how to write regular expressions. I don't use them much on the job, but you never know when you might need to. Here is the complete expression:
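
The complete expression is missing from the thread. The sketch below only assembles the pieces that are quoted: the month and day groups come from this post and the year group from Peter's pattern. The delimiter class and all names are assumptions, so this may differ from what was actually posted.

```java
import java.util.regex.Pattern;

class AssembledDatePattern {
    static final String MONTH = "([0-9]|0[1-9]|1[0-2])";            // quoted in the post
    static final String DAY = "([0-9]|0[1-9]|[1-2][0-9]|3[0-1])";   // quoted in the post
    static final String YEAR = "((19|20)?[0-9][0-9])";              // from Peter's pattern
    static final String DELIMITER = "[-/_.]";                        // assumption

    static final Pattern DATE =
            Pattern.compile(MONTH + DELIMITER + DAY + DELIMITER + YEAR);

    // True only if the whole input matches the assembled pattern.
    static boolean matches(String input) {
        return DATE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1/16/1908", "01-16-2008", "21/16/1908", "11/42/1908"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

The first two samples are accepted and the last two are rejected, matching the behaviour described in the post.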
Piet Verdriet
Ranch Hand

Joined: Feb 25, 2006
Posts: 266
Originally posted by Sunil Kumar:
I agree with Bauke.

As for the regex Peter provided: "-" (hyphen) carries a special meaning inside a Java regex character class, namely a range. So if you want to use the hyphen as a delimiter, you need to escape the - with a \ (backslash), as you would for any reserved character, just as was done for "." (dot).
So your regex will become


Note that inside a character class, the normal regex metacharacters lose their "special powers". So the class:



can be written as:



because the DOT doesn't match "any character", just the '.' itself.

Also note that the hyphen is treated as a "range operator" only when NOT placed at the start or end of a character class. So there's no need to escape the hyphen when doing:



which makes it (a bit) easier to read.
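
The three character classes in this post were lost. As a sketch under that caveat, the snippet below uses three forms consistent with the description (everything escaped, dot unescaped, hyphen moved to the front) and checks that they accept the same characters. The exact classes and all names are assumptions.

```java
import java.util.regex.Pattern;

class DelimiterClassForms {
    static final Pattern ALL_ESCAPED = Pattern.compile("[/\\-_\\.]");
    static final Pattern DOT_UNESCAPED = Pattern.compile("[/\\-_.]");
    static final Pattern HYPHEN_FIRST = Pattern.compile("[-/_.]");

    static boolean accepts(Pattern pattern, char c) {
        return pattern.matcher(String.valueOf(c)).matches();
    }

    // True if all three classes give the same answer for every ASCII character.
    static boolean agreeOnAscii() {
        for (char c = 0; c < 128; c++) {
            boolean expected = accepts(ALL_ESCAPED, c);
            if (accepts(DOT_UNESCAPED, c) != expected
                    || accepts(HYPHEN_FIRST, c) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all three forms agree: " + agreeOnAscii());
    }
}
```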
[ December 27, 2008: Message edited by: Piet Verdriet ]
Sunil Kumar
Ranch Hand

Joined: Apr 24, 2007
Posts: 76
Thanks Piet, that was a useful piece of information about hyphen (range operator)
Piet Verdriet
Ranch Hand

Joined: Feb 25, 2006
Posts: 266
Originally posted by Sunil Kumar:
Thanks Piet, that was a useful piece of information about hyphen (range operator)


You're welcome Sunil!
Alan Moore
Ranch Hand

Joined: May 06, 2004
Posts: 262
That regex will match "0/0/00". The year part is okay, but the month and day parts should match a leading zero only if it's followed by another, non-zero digit:
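
The corrected expression from this post is missing. One way to express the rule described (a leading zero only when a non-zero digit follows) is sketched below. The groups, the delimiter class, and the names are assumptions, not the poster's code.

```java
import java.util.regex.Pattern;

class NoZeroMonthOrDay {
    static final String MONTH = "(0?[1-9]|1[0-2])";           // 1-12, optional leading zero
    static final String DAY = "(0?[1-9]|[12][0-9]|3[01])";    // 1-31, optional leading zero
    static final String YEAR = "((19|20)?[0-9][0-9])";        // year part left unchanged
    static final String DELIMITER = "[-/_.]";                  // assumption

    static final Pattern DATE =
            Pattern.compile(MONTH + DELIMITER + DAY + DELIMITER + YEAR);

    // True only if the whole input matches.
    static boolean matches(String input) {
        return DATE.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"0/0/00", "00/16/2008", "1/16/1908", "01/06/08"}) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

Here "0/0/00" and "00/16/2008" are rejected, while "1/16/1908" and "01/06/08" are still accepted.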
 