I want to get a better understanding of security issues in the processing of contact forms, so that I can ensure that I’m doing everything that I could conceivably need to do while I work on my form’s PHP script. I had first better list what I’m doing so far:
Converting to HTML entities and trimming spaces
Checking for any unfamiliar array keys in POST (including select options)
…for things like “content-type”, “bcc:”, etc.
…that the max length of each input is as it should be
…each input with some regular expressions
…for common spam keywords
…and for URLs in the message
Reading that, it looks like I might know what I’m doing, but my knowledge is very patchy and I’ve had to spend a lot of time learning as I go along. Some of what I’ve added to my script may even be unnecessary or ineffective. Understanding PHP has been particularly difficult. Anyway, the questions:
Leaving aside spam for a minute, exactly what would a hacker enter into a form to try something malicious? What should I include in the regular expression that checks for such data (e.g. 3rd item in above list)? At the moment, I have a regex that I saw on another forum somewhere and it looks like this:
I would never use someone else’s script to process my form. I have tried that in the past and I never found one that would suit my demands, high among which is usability. I’ve spent the last few months learning how to build and fine-tune my own script because it’s the only way to get one that really works properly and behaves exactly as I expect.
That’s what I used to do, way back when I didn’t know enough to write my own PHP. I had to expend enough effort on hacking 3rd party scripts that I had to give up because the result wasn’t what I wanted and it was easier, by that point, to start from scratch.
But this is a bit off-topic; I just want to know if I have missed any validation methods/security tricks/etc. that would be worth using. Searching Google for advice leaves me thinking that few people have bothered to write in detail about this topic.
I seem to remember writing an article on this years ago, but I may not have posted it online anywhere. (I can’t find it.)
Most contact forms solicit four things:
The second item is what most attacks target. Because the mail() function in PHP requires the From header to be passed in its raw form, it presents the best opportunity for exploitation. I suggest two steps for making sure an email address is valid:
Use ctype_print() to make sure there are no non-printable characters in the email address. This is the least you can do.
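As a rough sketch of that first check (the helper name is_printable_email and the sample addresses are made up for illustration, and this assumes the ctype extension is enabled, which it is by default): ctype_print() returns false as soon as the string contains a control character, and that includes the carriage return and line feed that a header injection depends on.

```php
<?php
// Hypothetical helper, not part of PHP: true only if every character is printable.
function is_printable_email($email)
{
    // ctype_print() is false for an empty string and for any control
    // character, including "\r" and "\n".
    return is_string($email) && ctype_print($email);
}

$good = 'user@example.com';
// An injection attempt that tries to smuggle an extra header after the address.
$bad  = "user@example.com\r\nBcc: victim@example.org";

var_dump(is_printable_email($good)); // bool(true)
var_dump(is_printable_email($bad));  // bool(false)
```

This only rules out non-printable characters; it does not show that the string is a well-formed address.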
I have never heard of “Wufoo” and would not know what it is. I wanted to do it all myself and learn by doing (and reading and asking questions, of course). Someone else’s way of doing things is, in my experience, unlikely to suit me. I am extremely keen on accessibility while a lot of “professional” solutions, for various tasks, are not as accessible as they should be.
So I tend to bypass such things now for those reasons. I don’t have time constraints or clients to please, either, of course.
By the way, thanks to “shiflett” for your earlier reply, which I hadn’t seen.
Going back to your original question, here are a couple of things a hacker can try to do with a contact form:
Cross Site Scripting attack on you if you are viewing messages through a browser
Cross Site Scripting attack on your users if the message is displayed on the site somewhere (given you don’t have a db, I don’t think this is a concern)
Email SPAM by various methods
Crash your web server
There are a lot of other things one can do depending on how the forms are implemented on the client and server side.
I know you already went there in other posts but still, for security’s sake you are better off with a commercial or community-driven solution than writing your own. This advice is even stronger for dealing with SPAM.
That being said, basic security methodology combines a white list with a black list: accept only known-good input and block known-bad input.
For the name and title, make sure you do not accept any special characters; escaping is also good practice (black list). Taking the white list approach, you can probably write a robust regular expression for what a valid name and title should look like.
For the email, in addition to the patterns you described (black list), I would also add a regular expression for a standard email address, taken from the many examples out there (white list).
For the message, again stripping off special characters and escaping is a good practice.
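To make that a bit more concrete, here is a minimal sketch, not a definitive implementation. The function names, the set of characters allowed in a name and the 60-character limit are all my own assumptions, and I have used PHP’s built-in filter_var() with FILTER_VALIDATE_EMAIL in place of a hand-written email regular expression. The message is escaped with htmlspecialchars(), which only matters if it is ever viewed in a browser.

```php
<?php
// Hypothetical white list for a name: letters, spaces, periods,
// apostrophes and hyphens, 1 to 60 characters. \A and \z anchor the
// whole string, so a trailing newline is rejected as well.
function is_valid_name($name)
{
    return preg_match('/\A[\p{L} .\'-]{1,60}\z/u', $name) === 1;
}

// White list for the email address, using PHP's built-in validator.
function is_valid_email($email)
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Escape the message so that markup is displayed rather than interpreted.
function escape_message($message)
{
    return htmlspecialchars($message, ENT_QUOTES, 'UTF-8');
}

var_dump(is_valid_name("Anne-Marie O'Neil"));              // bool(true)
var_dump(is_valid_name("Bob\r\nBcc: x@example.org"));      // bool(false)
var_dump(is_valid_email('user@example.com'));              // bool(true)
var_dump(is_valid_email("user@example.com\r\nBcc: x@y.z")); // bool(false)
echo escape_message('<b>hi</b>'), "\n";                    // &lt;b&gt;hi&lt;/b&gt;
```

The name pattern is deliberately strict and will reject some legitimate names, so loosen it to suit your audience.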
None of the above suggestions takes your email client into account, and it expects to receive input in a specific format. Make sure you do not break anything by passing it input that it cannot handle.
Without seeing your specific implementation this is the best I can do.
Hope it helps.