GeeCON Prague 2014*



Question about Regex

josh gibson
Greenhorn

Joined: Feb 14, 2007
Posts: 1
Hey,
I'm trying to validate a textfield input for street address.
For the address, I used this bit of code:

However, some addresses, such as "40 comm ave #3", are rejected.
Can someone help me fix my problem?
Thank you.
Philip Shanks
Ranch Hand

Joined: Oct 15, 2002
Posts: 189


Welcome to JavaRanch, Josh!

First, generic address matching is not simple. There are lots of variations on the way an address can appear.

Now, for your specific example, the regex specifies one or more spaces between the hash mark and one or more subsequent digits. That rejects your example right there.
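The pattern from the original post is not shown above, so the following is only a minimal sketch of the spacing issue. Both fragments are hypothetical stand-ins: one requires whitespace between the hash mark and the digits (`\s+`), the other makes it optional (`\s*`).

```java
import java.util.regex.Pattern;

final class HashSpacingDemo {
    // Hypothetical fragments for illustration; NOT the pattern from the original post.
    static final Pattern REQUIRED_SPACE = Pattern.compile("#\\s+\\d+"); // one or more spaces after '#'
    static final Pattern OPTIONAL_SPACE = Pattern.compile("#\\s*\\d+"); // zero or more spaces after '#'

    // True only if the whole string matches the pattern.
    static boolean matches(Pattern p, String unit) {
        return p.matcher(unit).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(REQUIRED_SPACE, "#3"));  // false
        System.out.println(matches(REQUIRED_SPACE, "# 3")); // true
        System.out.println(matches(OPTIONAL_SPACE, "#3"));  // true
        System.out.println(matches(OPTIONAL_SPACE, "# 3")); // true
    }
}
```

Switching the quantifier from `+` to `*` is enough to accept a unit written as "#3" while still allowing "# 3".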

Next, if you intended to use an OR operator (the '|' character), then your grouping may not work. Don't use () for grouping unless you intend to also do capturing. If all you want is logical grouping, then use the non-capturing form (?: ).
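As a rough illustration of the grouping point, here is a sketch with three hypothetical patterns (the street suffixes and the overall shape are assumptions, not the original poster's regex). It shows that an ungrouped '|' splits the entire pattern, and that `(?: )` groups the alternatives without creating a capture group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupingDemo {
    // Hypothetical patterns for illustration only; the suffix list is an assumption.
    static final Pattern UNGROUPED = Pattern.compile("\\d+ [a-zA-Z]+ ave|st|rd");
    static final Pattern CAPTURING = Pattern.compile("\\d+ [a-zA-Z]+ (ave|st|rd)");
    static final Pattern NON_CAPTURING = Pattern.compile("\\d+ [a-zA-Z]+ (?:ave|st|rd)");

    public static void main(String[] args) {
        // Without grouping, '|' has the lowest precedence, so a bare "st" matches.
        System.out.println(UNGROUPED.matcher("st").matches());     // true
        System.out.println(NON_CAPTURING.matcher("st").matches()); // false

        Matcher c = CAPTURING.matcher("40 comm ave");
        Matcher n = NON_CAPTURING.matcher("40 comm ave");
        // Both accept the address; only the capturing form records a group.
        System.out.println(c.matches() + ", groups: " + c.groupCount()); // true, groups: 1
        System.out.println(n.matches() + ", groups: " + n.groupCount()); // true, groups: 0
    }
}
```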

Lastly, use the shorthand notation for [a-zA-Z], which is \w. That will make the regex a bit more readable (every little bit helps).

This matches the example address string that you provided, and I think it accomplishes what you were trying to do with the OR operator:


Hope this helps.
[ February 15, 2007: Message edited by: Philip Shanks ]

Philip Shanks, SCJP - Castro Valley, CA
My boss never outsources or has lay-offs, and He's always hiring. I work for Jesus! Prepare your resume!
Anton Uwe
Ranch Hand

Joined: Jan 10, 2007
Posts: 122
Originally posted by Philip Shanks:
Lastly, use the shorthand notation for [a-zA-Z], which is \w.

Please note that "\w" is not the same as "[a-zA-Z]", because "\w" also matches the digits 0-9 and the underscore.
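A quick way to see the difference is the small check below. The class and method names are made up for this sketch; the two character classes are the ones discussed in the thread.

```java
import java.util.regex.Pattern;

final class WordClassDemo {
    // \w is equivalent to [a-zA-Z_0-9] in Java's default (non-Unicode) mode.
    static boolean isWordChars(String s) {
        return Pattern.matches("\\w+", s);
    }

    // Letters only, as in the original character class.
    static boolean isLetters(String s) {
        return Pattern.matches("[a-zA-Z]+", s);
    }

    public static void main(String[] args) {
        System.out.println(isWordChars("comm")); // true
        System.out.println(isLetters("comm"));   // true
        System.out.println(isWordChars("a1_b")); // true: digits and underscore are accepted
        System.out.println(isLetters("a1_b"));   // false
    }
}
```

So a street name field validated with \w+ would also accept input such as "a1_b", which may or may not be what you want.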
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Interesting. I don't think I'd seen the pound sign in valid addresses, but the USPS Guidelines show it in the examples.

BTW: I have seen valid addresses like "123 N 15th St West" in a city that also had a 15th St East. Might have been Boston.

If you seriously need correct addresses, look into vendor packages and USPS offerings that match, validate and standardize addresses. Some even know whether a building exists or requires an apartment number. Not cheap, but some of this cleanup can save big bucks on postage.


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Other valid addresses include

PO Box 123

22B Baker St.

123 1/2 Main St.

RR03 Box 38D

RR 3 Box 38D

And those are just valid US formats. If you allow for customers in other countries (or, hey, we don't even know if you're in the US to begin with) then who knows what other formats may be valid. I would suggest that your validation should either be much, much more flexible, or much more precise about all of the possibilities that are really allowed. For the latter, as Stan suggests, if you need it, you should probably pay for a professional, well-tested solution unless you plan to spend a lot more time on this.
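For the "much more flexible" route, one possible sketch is below. The rule set is purely an assumption made for illustration (at least one letter or digit, and nothing outside letters, digits, spaces and a few punctuation marks), and the class and method names are hypothetical. It only screens out obviously bad input and does not confirm that an address exists.

```java
import java.util.regex.Pattern;

final class LenientAddressCheck {
    // Assumed rule set, not a postal standard:
    //   (?=.*[\p{L}\p{N}])       at least one letter or digit somewhere
    //   [\p{L}\p{N} #.,/'-]+     only letters, digits, spaces and # . , / ' -
    private static final Pattern LENIENT =
            Pattern.compile("(?=.*[\\p{L}\\p{N}])[\\p{L}\\p{N} #.,/'-]+");

    static boolean looksLikeAddress(String input) {
        return input != null && LENIENT.matcher(input.trim()).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "40 comm ave #3", "PO Box 123", "22B Baker St.", "123 1/2 Main St.",
            "RR03 Box 38D", "RR 3 Box 38D", "123 N 15th St West", "121 North 300 West",
            "", "###", "40 comm ave <b>"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + looksLikeAddress(s));
        }
    }
}
```

All of the address examples mentioned in this thread pass this check, while an empty string, "###", and input containing angle brackets do not. Anything stricter than this runs into the format variations listed above.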
[ February 15, 2007: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

Not to mention cities in Utah with street addresses like "121 North 300 West".
 