Regular Expressions for email

Hi,

I don’t normally ask this kind of question, as I don’t usually validate emails this way, but I need to validate an email address against the following rules:

  • The first and last character BEFORE the @ symbol are alphanumeric
  • Allow special characters BEFORE the @ symbol
  • Prevent special characters AFTER the @ symbol
  • Have at least one . AFTER the @ symbol

Can anyone please help me out? This is what I currently have:

^([a-zA-Z0-9_\-\.]+)@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$

Those two mentioned are the longest ones and so {2,6} should cover it properly - except for where you are checking for an IP address after the @ instead of a domain name - ‘joe bloggs’@127.125.3.67 would be a valid email address if the IP address were in the public IP address range…

The OP still hasn’t said what language they need to do this in. At least some languages have email validation built in and so don’t require a regular expression.

For example in PHP the if statement you need to test if an email address is valid or not is:

if(filter_var($email, FILTER_VALIDATE_EMAIL)){

Thanks John,

The backslash doesn’t work. I tried this:

^[a-zA-Z0-9]+([a-zA-Z0-9_\-\.\!#$%&()*+,-.\/{}|~\"]+)[a-zA-Z0-9]+@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$

But I get a parse error. Is this because I am using an ASP.NET Required Field Validator?

Actually, I just realized that limiting the TLD to 4 chars is still too restrictive, as there are longer ones like .travel and .museum. To fix that, we can simply require a minimum of 2 chars after the last period by using {2,} rather than {2,4}:

/^[a-zA-Z]+([a-zA-Z0-9_\-\.\!#$%&()*+,-.\/{}|~]+)[a-zA-Z0-9]+@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$/;
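To make the difference between the two quantifiers concrete, here is a minimal sketch that looks only at the final label of a domain. The names tldUpTo4, tldMin2 and endsWithTld are made up for illustration and are not part of the full expression.

```javascript
// Only the quantifier on the final label differs between the two patterns.
const tldUpTo4 = /\.[a-z]{2,4}$/; // final label of 2 to 4 letters
const tldMin2 = /\.[a-z]{2,}$/;   // final label of 2 or more letters

// Hypothetical helper: does the domain end in a label the pattern accepts?
function endsWithTld(pattern, domain) {
  return pattern.test(domain);
}

console.log(endsWithTld(tldUpTo4, "example.museum")); // false
console.log(endsWithTld(tldMin2, "example.museum"));  // true
console.log(endsWithTld(tldMin2, "example.c"));       // false
```

Neither pattern checks that the TLD actually exists; both only constrain its length and character set.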

It surely is possible. I’ve amended the regex to force the first char of the email to be alpha only, and to require at least 1 alphanumeric char immediately before the @ symbol.

In addition to this, I’m also allowing for TLDs up to 4 chars in length (e.g. .mobi and .asia)

I’ve also removed the single quotes from the email. While I realize that it’s valid, in my experience there is a high fail rate when trying to use


var r = /^[a-zA-Z]+([a-zA-Z0-9_\-\.\!#$%&()*+,-.\/{}|~]+)[a-zA-Z0-9]+@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})$/;

// Test a regex against a value; returns true on a match, false otherwise
var testRegex = function(rgMask, valueToTest) {
	return rgMask.test(valueToTest);
};
/* I used this regex test method to test a few addresses:

email@example.com // true
1email@example.com // false
email@example.com // false
ema+il@example.com // true
email@example.com.au // true
email@example.mobi // true
e1mail@example.com // true
email1@example.com // true
email#@example.com // false
em)(*&%$#,-.!/ail@example.com // true
email@example.something.ru // true

*/


In regular expressions, how do I allow for quotation marks?

You have to escape them with a backslash.
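As a rough illustration in JavaScript (an assumption on my part, since the thread does not say which regex engine is in play): inside a regex literal a double quote is just an ordinary character, and the backslash is only needed when the pattern itself sits inside a double-quoted string. The variable names are hypothetical.

```javascript
// In a regex literal, quotation marks need no escaping inside a character class.
const literal = /^[a-z"']+$/;

// When the pattern is written as a double-quoted string, the quote must be
// escaped for the string parser, and every regex backslash must be doubled.
const fromString = new RegExp("^[a-z\"']+\\.$");

console.log(literal.test('jo"e'));     // true
console.log(fromString.test('jo"e.')); // true
console.log(fromString.test('jo"e'));  // false
```

The same idea probably applies when the pattern is placed in a quoted markup attribute: the escaping has to satisfy whatever wraps the pattern, not only the regex engine.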

WOW! That’s huge!

Well the expression below:

^([a-zA-Z0-9_\-\.\!‘’#$%&'()*+,-./{}|~]+)@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$

Almost works. The ONLY thing I need to ensure is that the FIRST character BEFORE the @ symbol allows ONLY alpha and the LAST character BEFORE the @ allows alphanumeric characters…

Is this possible?

If you are looking to do a completely valid email address – as in out to full RFC 822 – in a regular expression it takes roughly half a page.

Much easier to separate the problem by splitting the address at the @ sign and validating each segment independently.
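A minimal sketch of that split-and-validate idea, assuming the rules the OP listed (alphanumeric at both ends of the part before the @, no special characters after it, at least one dot after it). The function name isAcceptableEmail and the exact set of special characters are my own assumptions, not anything agreed in this thread.

```javascript
// Hypothetical helper: validates the two halves of an address separately.
function isAcceptableEmail(address) {
  const at = address.lastIndexOf("@");
  // Need at least one character on each side of the @.
  if (at < 1 || at === address.length - 1) return false;

  const local = address.slice(0, at);
  const domain = address.slice(at + 1);

  // Local part: alphanumeric at both ends, special characters allowed in between.
  const localOk = /^[a-zA-Z0-9]([a-zA-Z0-9_.!#$%&()*+,\/{}|~-]*[a-zA-Z0-9])?$/.test(local);

  // Domain: lowercase letters, digits and hyphens in dot-separated labels,
  // with at least one dot and an alphabetic final label of 2+ letters.
  const domainOk = /^[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}$/.test(domain);

  return localOk && domainOk;
}

console.log(isAcceptableEmail("ema+il@example.com")); // true
console.log(isAcceptableEmail("email#@example.com")); // false
console.log(isAcceptableEmail("email@example"));      // false
```

Each half stays short enough to read, and a rule can be changed on one side without touching the other.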

Why do you want that particular small subset of email addresses? If we knew why you want to disallow all the other valid email addresses then we might be able to suggest a better way.

Well, having special characters in an email address is actually valid, which I didn’t know. Characters such as () are valid. So that’s the first thing I need to check.

Secondly, all email addresses need to have a . after the @ symbol and can’t have special characters after the @ as standard.

Finally, having the first and last character before the @ be alphanumeric is a client request.

Is the above possible?

I’ve been bitten more times than I’d care to mention when it comes to writing regular expressions for email addresses. You’ll either be too allowing or too restrictive. After a bit of Googling I’ve found what is claimed to be the only RFC-valid regex for emails.

[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\ xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xf f\
\\015()]*)*\\)[\\040\	]*)*(?:(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\x ff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff])|"[^\\\\\\x80-\\xff\
\\015 "]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015"]*)*")[\\040\	]*(?:\\([^\\\\\\x80-\\ xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80 -\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]* )*(?:\\.[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\ \\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\ x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x8 0-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff])|"[^\\\\\\x80-\\xff\
 \\015"]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015"]*)*")[\\040\	]*(?:\\([^\\\\\\x 80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^ \\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040 \	]*)*)*@[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([ ^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\ \\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\ x80-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80- \\xff\
\\015\\[\\]]|\\\\[^\\x80-\\xff])*\\])[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015() ]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\ x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:\\.[\\04 0\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\\ n\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\ 015()]*)*\\)[\\040\	]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?! [^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\ ]]|\\\\[^\\x80-\\xff])*\\])[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\ x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\01 5()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*)*|(?:[^(\\040)<>@,;:". \\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff] )|"[^\\\\\\x80-\\xff\
\\015"]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015"]*)*")[^ ()<>@,;:".\\\\\\[\\]\\x80-\\xff\\000-\\010\\012-\\037]*(?:(?:\\([^\\\\\\x80-\\xff\
\\0 15()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][ ^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)|"[^\\\\\\x80-\\xff\\ n\\015"]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015"]*)*")[^()<>@,;:".\\\\\\[\\]\\ x80-\\xff\\000-\\010\\012-\\037]*)*<[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(? :(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80- \\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:@[\\040\	]* (?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015 ()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015() ]*)*\\)[\\040\	]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\0 40)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\]]|\\\\ [^\\x80-\\xff])*\\])[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\ xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]* )*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:\\.[\\040\	]*(?:\\([^\\\\\\x80 -\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x 80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	 ]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:".\\\\ \\[\\]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\]]|\\\\[^\\x80-\\xff]) *\\])[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x 80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80 -\\xff\
\\015()]*)*\\)[\\040\	]*)*)*(?:,[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015( )]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\ \\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*@[\\040\	 ]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\0 15()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015 ()]*)*\\)[\\040\	]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^( \\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\]]| \\\\[^\\x80-\\xff])*\\])[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80 -\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015() ]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:\\.[\\040\	]*(?:\\([^\\\\\\x 80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^ \\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040 \	]*)*(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:". \\\\\\[\\]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\]]|\\\\[^\\x80-\\xff ])*\\])[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\ \\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x 80-\\xff\
\\015()]*)*\\)[\\040\	]*)*)*)*:[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015 ()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\ \\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*)?(?:[^ (\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\000- \\037\\x80-\\xff])|"[^\\\\\\x80-\\xff\
\\015"]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\\ n\\015"]*)*")[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]| \\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\)) [^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:\\.[\\040\	]*(?:\\([^\\\\\\x80-\\xff \
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\x ff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*( ?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\ 000-\\037\\x80-\\xff])|"[^\\\\\\x80-\\xff\
\\015"]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\ xff\
\\015"]*)*")[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\x ff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*) *\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*)*@[\\040\	]*(?:\\([^\\\\\\x80-\\x ff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80- \\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*) *(?:[^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\ ]\\000-\\037\\x80-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\]]|\\\\[^\\x80-\\xff])*\\] )[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80- \\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\x ff\
\\015()]*)*\\)[\\040\	]*)*(?:\\.[\\040\	]*(?:\\([^\\\\\\x80-\\xff\
\\015()]*( ?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()]*(?:\\\\[^\\x80-\\xff][^\\\\\\x80 -\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*)*\\)[\\040\	]*)*(?:[^(\\040)< >@,;:".\\\\\\[\\]\\000-\\037\\x80-\\xff]+(?![^(\\040)<>@,;:".\\\\\\[\\]\\000-\\037\\x8 0-\\xff])|\\[(?:[^\\\\\\x80-\\xff\
\\015\\[\\]]|\\\\[^\\x80-\\xff])*\\])[\\040\	]*(?: \\([^\\\\\\x80-\\xff\
\\015()]*(?:(?:\\\\[^\\x80-\\xff]|\\([^\\\\\\x80-\\xff\
\\015()] *(?:\\\\[^\\x80-\\xff][^\\\\\\x80-\\xff\
\\015()]*)*\\))[^\\\\\\x80-\\xff\
\\015()]*) *\\)[\\040\	]*)*)*>)

As you can see, it’s a bit crazy…

I find that the best solution is to use a basic regex (check if there is a @ somewhere near the middle) and to send a verification email.
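Something along these lines is what I mean by a basic check. It is only a sketch: looksLikeEmail is a made-up name, and rejecting whitespace is my own assumption (it would turn away quoted local parts that contain a space). Sending the verification email is left out entirely.

```javascript
// Deliberately loose: exactly one "@" with at least one character on each
// side and no whitespace. Anything stricter is left to the verification email.
function looksLikeEmail(address) {
  return /^[^\s@]+@[^\s@]+$/.test(address);
}

console.log(looksLikeEmail("joe@example.com")); // true
console.log(looksLikeEmail("joe.example.com")); // false
console.log(looksLikeEmail("joe@"));            // false
```

The regex only catches obvious typos; whether the mailbox exists is answered by the user clicking the link in the verification email.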

I’m testing with this and it looks like it is working:

^([a-zA-Z0-9_\-\.\!‘’#$%&'()*+,-./{}|~]+)@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$

However, I agree with you that the email should not start with a number. Is there a way I can prevent the first character from being a number?

Just looked into this and it looks like you are right.

So, just to change the rules slightly, I would need to check for this:

  • Allow special characters BEFORE the @ symbol
  • The first and last character BEFORE the @ symbol are alpha
  • Prevent special characters AFTER the @ symbol
  • Have at least one . AFTER the @ symbol

How far off am I with the current expression I have?
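For comparison, here is one way the four revised rules could be written as a single JavaScript regex. This is a sketch under my own assumptions: the name revisedRule is hypothetical, the list of allowed special characters is taken loosely from the expressions posted in this thread, and the domain is lowercase-only as in those expressions.

```javascript
// Letters at both ends of the part before the @, special characters allowed
// in between, only letters/digits/hyphens and dots after the @, and at least
// one dot followed by an alphabetic final label.
const revisedRule = /^[a-zA-Z]([a-zA-Z0-9_.!#$%&'()*+,\/{}|~-]*[a-zA-Z])?@[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}$/;

console.log(revisedRule.test("e1mail@example.com"));  // true
console.log(revisedRule.test("1email@example.com"));  // false
console.log(revisedRule.test("email1@example.com"));  // false
console.log(revisedRule.test("email@example"));       // false
```

Putting the hyphen last in the character class keeps it literal, which avoids accidental ranges such as ,-. matching more than intended.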

If I recall correctly, email addresses can’t start with a number

If you would like to validate whether an e-mail address conforms to the relevant RFC and you are using Perl, there is a module that already encapsulates this check. It is called Email::Valid https://metacpan.org/module/Email::Valid

If you have some special rules in addition to the standard, I’d first check that the e-mail conforms to the standard and then apply the specialized regex.