Bot spamming on forms


#1

Hey everyone.

I have a client with an old website built using tables and Adobe plugins, and no SSL. It is a rather old site in terms of its structure and how it was built, and it is served over plain HTTP.

The site is getting bombarded with spam via the contact form. I was wondering: does the outdated website structure and HTTP make it easier for bots to send spam? Does it affect it at all?


#2

Hi there. The contact form is public and it seems to have no security mechanisms in place.
There are a few common security features you can apply to your form to prevent spam.
Probably the most popular is the CAPTCHA; however, it affects usability negatively, as CAPTCHAs are often quite hard for users to figure out. Assume some users will give up before managing to submit the form.

So I’m going to favour a few other mechanisms that happen under the hood and are unobtrusive.

One of those is the token. The idea is that on every page request that renders the form, you generate a unique token, put it in a hidden field in the form, and also store it in the server session. Once the form is submitted, you check that the hidden field's token matches the one stored in the session; if it doesn't, you invalidate the submission. This prevents bots from brute-forcing the form and is especially useful on login forms. If you combine this with timing, i.e. the time elapsed between the generation of the token and the submission of the form, you can also check that the form was not submitted too quickly. Bots normally submit forms much faster than a human, so what I normally do is check whether the elapsed time is less than a second and, if it is, invalidate the submission.
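A minimal PHP sketch of the token-plus-timing idea described above (the session keys `form_token` and `form_token_time` and the hidden field name `token` are illustrative choices, not prescribed):

```php
<?php
// Token + timing sketch: names of session keys and fields are assumptions.
session_start();

// When rendering the form: generate a token, store it with a timestamp.
$token = bin2hex(random_bytes(16));
$_SESSION['form_token'] = $token;
$_SESSION['form_token_time'] = time();
echo '<input type="hidden" name="token" value="' . $token . '">';

// When handling the submission:
function is_valid_submission(array $post, array $session): bool
{
    if (!isset($post['token'], $session['form_token'], $session['form_token_time'])) {
        return false;
    }
    if (!hash_equals($session['form_token'], $post['token'])) {
        return false; // token mismatch: invalidate
    }
    if (time() - $session['form_token_time'] < 1) {
        return false; // submitted in under a second: almost certainly a bot
    }
    return true;
}
```

On a real site you would also clear the token after a successful check so it cannot be replayed.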

The other method I can think of is the honeypot, which will probably get rid of a big chunk of the spam, though not all of it. It consists of adding a text field to your form that is not visible to the user. Bots will normally try to fill in every field, so once the form is submitted you check whether this field has a value; if it does, you invalidate the submission.
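A short sketch of the honeypot in PHP (the field name `website` is an illustrative assumption; anything a bot would plausibly fill in works):

```php
<?php
// Honeypot sketch: render a text field hidden from humans but visible
// to bots scraping the markup. The name "website" is an assumption.
echo '<input type="text" name="website" value="" style="display:none" tabindex="-1" autocomplete="off">';

// On submission: a human never sees the field, so any value means a bot.
function honeypot_tripped(array $post): bool
{
    return isset($post['website']) && $post['website'] !== '';
}
```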

I hope these tips have helped you.

All the best…


#3

Not all of them. Invisible reCAPTCHA does a nice job of leaving the user alone when there is enough evidence that they're not a bot (it checks mouse movements, keystroke patterns, etc.).


#4

You can also check the HTTP_REFERER header. With bots it is often missing or blank.
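A sketch of that check in PHP. Note the referer header is trivially forged, so treat a missing or foreign referer as one signal among several, not proof (the host name below is an example):

```php
<?php
// Referer-check sketch: missing or foreign referer is a bot signal.
function referer_looks_ok(?string $referer, string $expected_host): bool
{
    if ($referer === null || $referer === '') {
        return false; // many bots send no referer at all
    }
    return parse_url($referer, PHP_URL_HOST) === $expected_host;
}

// Usage (host is illustrative):
// referer_looks_ok($_SERVER['HTTP_REFERER'] ?? null, 'example.com');
```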


#5

Thanks everyone for your suggestions!

I’m going to apply the methods mentioned above, and I’ll go ahead and check out Invisible reCAPTCHA! It seems like a big improvement over v2. Maybe I can do both?

One thing: does the site being served over HTTP affect it at all? Does it make it easier for bots to bypass security measures?


#6

You should check to make sure the form isn’t vulnerable to header injection, i.e. what may look like spamming of your form may in fact be the form being used to BCC spam to many other people.
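A quick sketch of what that check can look like: if a user-supplied value that ends up in a mail header contains a CR or LF, it can smuggle in extra headers such as `Bcc:` (the function names are illustrative):

```php
<?php
// Header-injection check sketch: CR/LF in a value destined for a mail
// header could smuggle extra headers (e.g. Bcc: spam recipients).
function contains_header_injection(string $value): bool
{
    return (bool) preg_match('/[\r\n]/', $value);
}

// Safer still: reject the submission outright rather than cleaning,
// but if you must clean, strip the line breaks:
function strip_header_breaks(string $value): string
{
    return str_replace(["\r", "\n"], '', $value);
}
```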


#7

It should not make a difference: if the form is not able to filter out spam bots, you may get spam.
I have used methods like those described by @Andres_Vaquero in combination (the token, the honeypot and form timing), and so far it has not let any spam through, without making users jump through any hoops.

In addition, I use quite rigorous and strict validation on all fields: checking that every field that should be set is set, then checking the data type and content of every field against what is expected and allowed. The script knows which fields carry user-set values and which are system-set. The system-set fields can be far stricter about which values are considered valid. Any deviation from the expected data should trip the alarm.
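A sketch of that kind of strict validation in PHP. The field names and rules here are illustrative: every expected field must be present, nothing unexpected is allowed, and a system-set field (like the submit value) must match exactly:

```php
<?php
// Strict-validation sketch: field names and patterns are assumptions.
function validate_submission(array $post): bool
{
    $patterns = [
        'first_name' => '/^[\p{L} \'-]{1,50}$/u',
        'last_name'  => '/^[\p{L} \'-]{1,50}$/u',
        'submit'     => '/^Send Now$/', // system-set: exact value only
    ];

    // Any field we did not expect trips the alarm.
    if (array_diff_key($post, $patterns, ['email' => 1]) !== []) {
        return false;
    }
    // Email gets a dedicated validator.
    if (!isset($post['email']) || filter_var($post['email'], FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    // Every remaining field must be present and match its pattern.
    foreach ($patterns as $field => $pattern) {
        if (!isset($post[$field]) || !preg_match($pattern, $post[$field])) {
            return false;
        }
    }
    return true;
}
```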

One caveat I found from experience with storing form tokens in sessions: it has been known to trigger false positives when the user has more than one instance of the form open in different browser tabs. A token may be overwritten in the session and then fail to match.
The workaround was to store the info for each instance of the form separately, under its own unique ID.
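A sketch of that workaround: tokens are stored under a unique form ID instead of one shared session key, so multiple open tabs don't overwrite each other (function and key names are illustrative):

```php
<?php
// Per-instance token storage sketch: $session stands in for $_SESSION.
function issue_token(array &$session): array
{
    $id = bin2hex(random_bytes(8));
    $token = bin2hex(random_bytes(16));
    $session['form_tokens'][$id] = $token;
    return [$id, $token]; // embed both as hidden fields in the form
}

function check_token(array &$session, string $id, string $token): bool
{
    if (!isset($session['form_tokens'][$id])) {
        return false;
    }
    $ok = hash_equals($session['form_tokens'][$id], $token);
    unset($session['form_tokens'][$id]); // one-time use
    return $ok;
}
```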


#8

A lot of bots (especially old ones) don’t support JavaScript, so:

  • you can use it to add an extra field to the form; if it’s missing at validation time, the submission came from a bot
  • measure the time between page load and form submission (this also works against more advanced bots)
  • add reCAPTCHA, though there are captcha-breaking services, so you might invent your own creative questions
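The first point above can be sketched like this: the page ships a small script that adds a hidden field, so clients without JavaScript (most old bots) never send it. The field name `js_check` is an illustrative assumption:

```php
<?php
// JS-added-field sketch: the script runs only in real browsers.
echo '<script>
  document.forms[0].insertAdjacentHTML("beforeend",
    "<input type=\"hidden\" name=\"js_check\" value=\"1\">");
</script>';

// On the server: a missing field means no JS ran, likely a bot.
function passed_js_check(array $post): bool
{
    return ($post['js_check'] ?? '') === '1';
}
```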

It all depends on whether their site’s form was “targeted” or just auto-collected by some bot software.


#9

…Or someone who is not running JavaScript in their browser.
To use this you would need a note telling people that they must have JavaScript enabled to use the form.

That may be a problem with the common captcha methods, so inventing your own may be an idea.

Targeted attacks will be harder to guard against, but I think the majority are automated and not too smart; they are just scouting for weaknesses to exploit.
I think most “small-time” sites don’t often get specifically targeted attacks.


#10

“Or someone who is not running javascript in their browser.
To use this you would have to have a note telling people that they must be running javascript to use the form.”

  • Even Google now requires JS to log in; 95% of users who aren’t running JS are bots, probably more :slight_smile:

“That may be a problem with the common captcha methods, so inventing your own may be an idea”
reCAPTCHA can easily be beaten: just use Death By Captcha, DeCaptcher or other such services (or AutoIt, lol). If you know how to program, it only takes a couple of sentences and unique strings that are not in the databases of standard questions and answers.


#11

This is the method I normally use, and I found this article very helpful: https://kiwee.eu/stop-form-spam-robots-honeypot/


#12

Skimming through it, that article seems to combine the three solutions I described. Good one!


#13

The ultimate way to stop spambots is to use the Akismet plugin.


#14

I wouldn’t describe Akismet as the “ultimate way to stop spambots”, @puho, although I’ve never used it in this context.

We use it on the forums, and it requires a moderator to check every post which is tagged as Spam. Quite a high percentage of those are not Spam. We also get real Spam posts which are not caught by Akismet at all.

Do you have experience of using it with forms which you can share?


#15

I would say using multiple techniques would handle this problem: a combination of a honeypot and some way of making the bot have to input something would help.


#16

Thanks everyone for your replies!

Okay so these bots are doing something that I can’t seem to figure out.

In their spam emails, they are literally including everything in the form element, even the submit button!

So like:
"first name: xxx
last name: yyy
submit: Send Now"

Some of the spam email even contained the captcha response!

Could this be happening due to poor validation?


#17

It could be happening because you don’t yet have some of the other spam filters we’ve talked about in place. The more methods you combine (even getting creative helps), the better your chances of filtering out those little buggers.
The honeypot method is perhaps the one that returns the most benefit for the least effort, but assume some robots already know about it and work around it.


#18

You should be able to stop these.
The honeypot already mentioned should work if it’s filling in everything.
Also…

This is what I meant here:-

You know exactly what the submit value should be, so you put a test in place for that.

if ($_POST['submit'] != 'value it should be') {
    // You got a Bot!!
}

#19

I figured out the issue: I was implementing the honeypot method incorrectly, and it’s working as intended now. Thanks again, everyone, for your help and input.

One question:
The website is built using inline CSS and tables; it’s a very old website. My only worry is that since the bots are viewing the source HTML, they can see that the honeypot fields are set to style="display:none" and can easily bypass them.

Should I add external CSS styling just for these two honeypot fields? Obviously the whole site should be built with an external stylesheet in mind, but setting that point aside.