
Question about Regex

 
josh gibson
Greenhorn
Posts: 1
Hey,
I'm trying to validate a textfield input for street address.
For the address, I used this bit of code:

However, some addresses, such as "40 comm ave #3", are rejected.
Can someone help me fix my problem?
Thank you.
 
Philip Shanks
Ranch Hand
Posts: 189

Welcome to JavaRanch, Josh!

First, generic address matching is not simple. There are lots of variations on the way an address can appear.

Now, for your specific example, the regex specifies one or more spaces between the hash mark and one or more subsequent digits. That rejects your example right there.
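To illustrate the spacing point, here is a minimal sketch. The original regex is not reproduced in this thread, so both fragments below are hypothetical stand-ins: one requires whitespace after the hash mark, the other makes it optional. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class HashSpacingDemo {
    // Hypothetical fragment: '#', then ONE OR MORE whitespace characters, then digits.
    static final Pattern STRICT = Pattern.compile("#\\s+\\d+");
    // Relaxed fragment: the whitespace after '#' is optional (zero or more).
    static final Pattern RELAXED = Pattern.compile("#\\s*\\d+");

    // True if the strict fragment occurs anywhere in the input.
    static boolean strictFinds(String s) {
        return STRICT.matcher(s).find();
    }

    // True if the relaxed fragment occurs anywhere in the input.
    static boolean relaxedFinds(String s) {
        return RELAXED.matcher(s).find();
    }

    public static void main(String[] args) {
        String address = "40 comm ave #3";
        System.out.println(strictFinds(address));  // false: no space between '#' and '3'
        System.out.println(relaxedFinds(address)); // true
    }
}
```

Changing `+` to `*` on the whitespace is the only difference, and it accepts both "#3" and "# 3".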

Next, if you intended to use an OR operator (the '|' character), then your grouping may not work. Don't use () for grouping unless you intend to also do capturing. If all you want is logical grouping, then use the non-capturing form (?: ).
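A small sketch of how grouping changes what '|' applies to, and of capturing versus non-capturing groups. The patterns are simplified illustrations I made up for this purpose, not the regex from the original post.

```java
import java.util.regex.Pattern;

public class GroupingDemo {
    // Without a group, '|' splits the WHOLE pattern: "\d+ ave" OR just "st".
    static final Pattern UNGROUPED = Pattern.compile("\\d+ ave|st");
    // A capturing group limits the alternation and also records the match.
    static final Pattern CAPTURING = Pattern.compile("\\d+ (ave|st)");
    // A non-capturing group limits the alternation without recording anything.
    static final Pattern NON_CAPTURING = Pattern.compile("\\d+ (?:ave|st)");

    // True if the pattern matches the entire input.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    // Number of capturing groups declared in the pattern.
    static int groupCount(Pattern p) {
        return p.matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(matches(UNGROUPED, "st"));        // true, probably not intended
        System.out.println(matches(UNGROUPED, "40 st"));     // false
        System.out.println(matches(NON_CAPTURING, "40 st")); // true
        System.out.println(groupCount(CAPTURING));           // 1
        System.out.println(groupCount(NON_CAPTURING));       // 0
    }
}
```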

Lastly, use the shorthand notation for [a-zA-Z], which is \w. That will make the regex a bit more readable (every little bit helps).

This matches the example address string that you provided, and I think it accomplishes what you were trying to do with the OR operator:


Hope this helps.
[ February 15, 2007: Message edited by: Philip Shanks ]
 
Anton Uwe
Ranch Hand
Posts: 122
Originally posted by Philip Shanks:
Lastly, use the shorthand notation for [a-zA-Z], which is \w.

Please note that "\w" is not the same as "[a-zA-Z]", because "\w" also matches the digits 0-9 and the underscore.
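A quick check of the difference, as a sketch (the class and method names are invented for the example):

```java
public class WordClassDemo {
    // True if the whole string consists of \w characters: letters, digits, underscore.
    static boolean isWordChars(String s) {
        return s.matches("\\w+");
    }

    // True if the whole string consists of ASCII letters only.
    static boolean isLetters(String s) {
        return s.matches("[a-zA-Z]+");
    }

    public static void main(String[] args) {
        System.out.println(isWordChars("ave"));   // true
        System.out.println(isLetters("ave"));     // true
        System.out.println(isWordChars("ave_3")); // true: digit and underscore are allowed
        System.out.println(isLetters("ave_3"));   // false
    }
}
```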
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Interesting. I don't think I'd seen the pound sign in valid addresses, but the USPS Guidelines show it in the examples.

BTW: I have seen valid addresses like "123 N 15th St West" in a city that also had a 15th St East. Might have been Boston.

If you seriously need correct addresses, look into vendor packages and USPS offerings that match, validate and standardize addresses. Some even know whether a building exists or requires an apartment number. Not cheap, but some of this cleanup can save big bucks on postage.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Other valid addresses include

Po Box 123

22B Baker St.

123 1/2 Main St.

RR03 Box 38D

RR 3 Box 38D

And those are just valid US formats. If you allow for customers in other countries (or, hey, we don't even know if you're in the US to begin with), then who knows what other formats may be valid. I would suggest that your validation should be either much, much more flexible, or much more precise about all of the possibilities that are really allowed. For the latter, as Stan suggests, if you need it, you should probably pay for a professional, well-tested solution unless you plan to spend a lot more time on this.
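If you go the "much more flexible" route, one way to keep yourself honest is to run every candidate pattern against a list of known-good addresses. The sketch below uses the examples mentioned in this thread; the permissive pattern is only an assumption for illustration (it checks the character set, not the address structure), and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class AddressSamples {
    // Assumed permissive pattern: starts with a letter or digit, then any mix of
    // letters, digits, spaces and a few punctuation marks. It does NOT verify
    // that the address is real or well-formed.
    static final Pattern PERMISSIVE =
            Pattern.compile("[A-Za-z0-9][A-Za-z0-9 .,#/'-]*");

    // Addresses quoted in this thread as valid.
    static final List<String> SAMPLES = Arrays.asList(
            "40 comm ave #3",
            "123 N 15th St West",
            "Po Box 123",
            "22B Baker St.",
            "123 1/2 Main St.",
            "RR03 Box 38D",
            "RR 3 Box 38D");

    // True if the trimmed input matches the permissive pattern in full.
    static boolean looksLikeAddress(String s) {
        return s != null && PERMISSIVE.matcher(s.trim()).matches();
    }

    // Returns the samples that the pattern rejects; ideally an empty list.
    static List<String> rejectedSamples() {
        List<String> rejected = new ArrayList<>();
        for (String sample : SAMPLES) {
            if (!looksLikeAddress(sample)) {
                rejected.add(sample);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("Rejected samples: " + rejectedSamples());
    }
}
```

Any time you tighten the pattern, rerun the list and add each newly reported false rejection to it.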
[ February 15, 2007: Message edited by: Jim Yingst ]
 
Paul Clapham
Sheriff
Posts: 20185
Not to mention cities in Utah with street addresses like "121 North 300 West".
 