I’m working on a script for a subscription form. The user inputs their name, email address and a CAPTCHA. The data will go into a MySQL database and a cron job will send them what they subscribed to.
My question is, what’s the best way to sanitize the email address?
Should I use something like this?
Or is there a better way?
For the name I’m just using preg_replace:
I’ve come to the decision that validating an email address is pointless. If a user genuinely wants to receive mail from a site, then they’ll give you a valid and working email. But if they don’t want our spam, yet we force them to give an email anyway, then they’re going to give something like firstname.lastname@example.org. An email can be valid yet obviously fake. I think the better solution is to not require an email unnecessarily. It should always be an optional piece of information that the user can provide if and when they decide they actually want to receive something from you.
I do see your point on this, but in this case the form is for signing up to an email service, so providing a valid email address is necessary. Not doing so would be rather pointless.
No one is holding a gun to their head, forcing them to fill the form in. The user can choose to fill in the form if they want our emails, or not if they don’t.
Quite true, there is not a lot we can do about those. But I don’t see any harm in filtering out invalid addresses which would otherwise populate my database, forcing the cron job to go through them all every time it fires off.
Those are the ones I’m interested in, anything else can go…
I guess it will do. I don’t fancy the prospect of writing my own regex to validate email addresses.
To confirm that the email address is actually valid you need to send an email containing a registration link to that address. When they click on that link you then know that the email address is valid. If they don’t click within a set time (possibly a couple of days) you discard it as invalid.
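As a rough illustration of that confirm-by-link flow, here is a minimal sketch in Python (the same idea carries over to PHP). The function names, the in-memory `pending` store and the two-day window are all assumptions made for the example; a real form would keep pending rows in the database and send the token inside the registration link.

```python
import secrets
from datetime import datetime, timedelta, timezone

# Assumption: "a couple of days" is taken to mean two days.
CONFIRM_WINDOW = timedelta(days=2)

# Hypothetical in-memory stand-in for a "pending subscriptions" table:
# token -> (email, expiry time)
pending = {}


def start_subscription(email, now=None):
    """Store the address as pending and return the token for the emailed link."""
    now = now or datetime.now(timezone.utc)
    token = secrets.token_urlsafe(32)  # unguessable, URL-safe token
    pending[token] = (email, now + CONFIRM_WINDOW)
    return token


def confirm_subscription(token, now=None):
    """Return the confirmed address, or None if the token is unknown or expired."""
    now = now or datetime.now(timezone.utc)
    entry = pending.pop(token, None)  # each token can be used only once
    if entry is None:
        return None
    email, expires_at = entry
    if now > expires_at:
        return None
    return email


def purge_expired(now=None):
    """Discard pending addresses that were never confirmed in time."""
    now = now or datetime.now(timezone.utc)
    expired = [t for t, (_, exp) in pending.items() if now > exp]
    for t in expired:
        del pending[t]
    return len(expired)
```

Only addresses returned by `confirm_subscription` would be moved to the list the cron job mails; something like `purge_expired` could run from the same cron job to drop the rest.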
Of course. Now you mention it, I have seen this before with forums and suchlike. It is one extra step for the user to carry out, but it will weed out the phony addresses that do pass a validation filter.
Well… sort of. For example, I very frequently use disposable email services such as Mailinator, where an address like email@example.com will work. If the user doesn’t want to give their email, then they’re not going to give their email, and no amount of validation or verification will change that.
OK, so it’s not 100% watertight, but those two levels of validation have got to be better than none at all. Bearing in mind the sole purpose of the form is to subscribe to emails, users who don’t want to give their address don’t fill in the form at all. But if they want to receive the emails, then they are required to give a valid address.
No, it isn’t. It prevents someone else from giving your email address to sign you up without your permission.
It is also a legal requirement in some countries such as Australia, where NOT validating that the owner of the email address really wants to receive future emails from you automatically defines all your emails as spam.
To me, email verification should happen through a real email. The user adds an email address, wanting whatever service it is you offer, and you send an email to that address with a link to a web page, which states something like, “You are now opting in for our service, thank you very much! Click the button below to approve”. It is one more step, but then the user’s approval can be logged and stored for posterity to satisfy laws like those in Germany and Australia, and the email address is 100% verified.
Apart from felgall’s reasoning on why the verification is not redundant, there’s one more pragmatic reason: people are known to make mistakes, and a certain percentage of people will misspell their email in such a way as to change it into an invalid format. In such cases validation will prevent you from collecting invalid emails.
For the same reason I stated above, I don’t think sanitizing emails makes any sense at all. If an unexpected invalid character appears in an email then there is an 80% probability that the user made a mistake by typing an invalid character instead of a proper one, and in this case automatically stripping characters will not change the email into a valid one. Emails should be validated as soon as possible, and if they don’t pass validation then users should re-enter them. Once they are validated, sanitization is not necessary.
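To make the difference concrete, here is a small sketch in Python contrasting the two approaches. The pattern is a deliberately simple assumption, not a full RFC-compliant check, and both function names are made up for the example. In PHP, the built-in `filter_var()` with `FILTER_VALIDATE_EMAIL` would fill the validating role.

```python
import re

# Assumption: a simplified pattern, good enough to catch typical typos.
# It is NOT a complete implementation of the email address grammar.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def is_plausible_email(address):
    """Validate: accept or reject the address as typed, never alter it."""
    return EMAIL_PATTERN.fullmatch(address.strip()) is not None


def strip_invalid_chars(address):
    """Naive sanitizer, shown only for contrast: silently drops odd characters."""
    return re.sub(r"[^A-Za-z0-9.@_+-]", "", address)


# The user meant "jane.doe@example.com" but hit "," instead of ".".
typed = "jane,doe@example.com"

# Validation rejects the typo, so the form can ask the user to re-enter it.
rejected = not is_plausible_email(typed)

# Sanitizing produces a well-formed address that belongs to someone else.
sanitized = strip_invalid_chars(typed)
```

Here `sanitized` ends up as `janedoe@example.com`, which passes the format check but is not the address the user intended, so the mail would go to the wrong place or nowhere.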
Even if email addresses came from an unknown or external source that I have no control over, I still wouldn’t trust sanitization to turn the bad ones into good ones. If they don’t validate, I simply discard them, or maybe mark them to be corrected by a human.