Email regEx- should not start with dot(.)

Gaurav Sharma
Ranch Hand

Joined: Jan 30, 2007
Posts: 30
Hello Experts

I am using the regular expression below in my application:
^([a-zA-Z0-9_\-\.&\*\+='/\{\}~]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

The limitation of this regex is that it accepts a dot (.) at the beginning, which is wrong.

Can anyone please help me update this regex so that it does not accept a dot (.) at the start?

Many Thanks
Gaurav
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3018
    
Take the

[a-zA-Z0-9_\-\.&\*\+='/\{\}~]+

and replace it with

[a-zA-Z0-9_\-\&\*\+='/\{\}~][a-zA-Z0-9_\-\.&\*\+='/\{\}~]*

This way you just make a slightly different expression for the first character.

--------

Alternatively, just add this at the very beginning, right after the ^:

(?!\.)

That's called a negative lookahead: the match fails if the next character is a dot, and no input is consumed.
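A minimal sketch of both suggestions using java.util.regex. The class and constant names are made up for illustration, and the domain part is copied unchanged from the original expression, so this only addresses the leading dot and nothing else.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the thread.
class LeadingDotCheck {
    // Local-part character set copied from the original expression.
    private static final String LOCAL_CHARS = "[a-zA-Z0-9_\\-\\.&\\*\\+='/\\{\\}~]";
    // The same set without the dot, used only for the first character.
    private static final String FIRST_CHAR = "[a-zA-Z0-9_\\-&\\*\\+='/\\{\\}~]";
    // Domain part, unchanged from the original expression.
    private static final String DOMAIN =
        "((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.)+))"
        + "([a-zA-Z]{2,4}|[0-9]{1,3})(\\]?)";

    // Original expression, kept for comparison: it accepts a leading dot.
    static final Pattern ORIGINAL =
        Pattern.compile("^(" + LOCAL_CHARS + "+)@" + DOMAIN + "$");

    // Suggestion 1: a separate character class for the first character.
    static final Pattern FIRST_CHAR_VARIANT =
        Pattern.compile("^(" + FIRST_CHAR + LOCAL_CHARS + "*)@" + DOMAIN + "$");

    // Suggestion 2: negative lookahead placed right after the ^ anchor.
    static final Pattern LOOKAHEAD_VARIANT =
        Pattern.compile("^(?!\\.)(" + LOCAL_CHARS + "+)@" + DOMAIN + "$");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"john.doe@example.com", ".john@example.com"}) {
            System.out.println(s
                + " original=" + matches(ORIGINAL, s)
                + " firstChar=" + matches(FIRST_CHAR_VARIANT, s)
                + " lookahead=" + matches(LOOKAHEAD_VARIANT, s));
        }
    }
}
```

Both variants should reject the same leading-dot inputs. The first one avoids lookahead entirely, which matters if the regex engine in use does not support it.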
Gaurav Sharma
Ranch Hand

Joined: Jan 30, 2007
Posts: 30
Hey Simon,

Many thanks for your instant reply.

I am validating the regex using the URL below:
http://jakarta.apache.org/regexp/applet.html

I tried adding the suggested fragment "(?!\.)", but it didn't work.
Maybe I did not add it properly.
Could you please give me the full regex?

Regards
Gaurav
Mike Simmons
Ranch Hand

Joined: Mar 05, 2008
Posts: 3018
    
Well, that link you gave uses the Jakarta Regexp package, which is not exactly the same as java.util.regex, which is what I was using. In particular, the Jakarta package does not support any form of lookahead. (Compare org.apache.regexp.RE with java.util.regex.Pattern.) If you want to use Jakarta Regexp, you'll have to use my first suggestion, not the second.
James Sabre
Ranch Hand

Joined: Sep 07, 2004
Posts: 781

Gaurav Sharma wrote:Hello Experts

I am using the regular expression below in my application:
^([a-zA-Z0-9_\-\.&\*\+='/\{\}~]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

The limitation of this regex is that it accepts a dot (.) at the beginning, which is wrong.

Can anyone please help me update this regex so that it does not accept a dot (.) at the start?

Many Thanks
Gaurav


There is considerably more wrong with that regex than just the leading dot. For example,
each part of the IP address is allowed to be as high as 999.
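As a rough illustration of that point, here is one way an octet could be limited to 0-255 with java.util.regex. The class and constant names are hypothetical, and this checks a bare dotted quad only, not the bracketed form inside an email address.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the thread.
class OctetCheck {
    // Alternatives, in order: 250-255, 200-249, 100-199, then 0-99.
    static final String OCTET = "(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])";

    // Four octets separated by dots, anchored at both ends.
    static final Pattern IPV4 =
        Pattern.compile("^" + OCTET + "(\\." + OCTET + "){3}$");

    static boolean isIpv4(String s) {
        return IPV4.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"192.168.0.1", "999.1.1.1"}) {
            System.out.println(s + " -> " + isIpv4(s));
        }
    }
}
```

The [0-9]{1,3} in the original expression accepts any run of one to three digits, which is why 999 gets through there.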

Retired horse trader.
Gaurav Sharma
Ranch Hand

Joined: Jan 30, 2007
Posts: 30
Hi Simon,

I tried your first option and it's working.

Now I am using:
^([a-zA-Z0-9_\-\&\*\+='/\{\}~][a-zA-Z0-9_\-\.&\*\+='/\{\}~]*)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

Tons of thanks
Gaurav
Gaurav Sharma
Ranch Hand

Joined: Jan 30, 2007
Posts: 30
Hi James

How can we refine it more?

Regards
Gaurav
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

It depends on how compliant you want your email addresses to be. The canonical email regex is the one in Mail-RFC822-Address. You may or may not want to bother with being fully compliant.

I use this example a lot when I think something should either (a) not be handled with regex, or (b) be narrowed to a smaller subset of the problem.
James Sabre
Ranch Hand

Joined: Sep 07, 2004
Posts: 781

Gaurav Sharma wrote:Hi James

How can we refine it more?

Regards
Gaurav


The regex for full email address validation is very large. I published one in reply #20 here, but I made it clear that it needs a lot more testing before it should be used in production. I wrote that validator to validate RFC 821 email addresses. I'm not sure what the latest RFC is, but if you are to do it properly you need to find out. Having said that, you still have a problem: a lot of service providers allow email addresses outside the official specification, so you might reject addresses that are valid as far as your customer is concerned.

Many others have published email address regex validators of various conformance levels. Google will find them.

So why do you think you need to validate an email address? I used to think it necessary, but these days I don't bother. If someone supplies aaaa@bbbb.ccc.ddd.ee, your regex might accept it even though no such address exists.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19760
    

I always just use InternetAddress for parsing email addresses...


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
James Sabre
Ranch Hand

Joined: Sep 07, 2004
Posts: 781

Rob Prime wrote:I always just use InternetAddress for parsing email addresses...


Yep - if one is compelled to validate an email address, this looks to be by far the best approach.
 