Currently on my website I've been using CAPTCHA for private message conversation reply forms, commenting forms, and almost every form page on the website (that creates new records).
It gets pretty expensive for the server to have to create, draw and store (the text in the db) the CAPTCHA-related info EVERY TIME a form is rendered (even if the form is never filled out).
I've thought about having a spam filter or an anti-flooding system for the website, but I'm not 100% sure how to implement it. Should I create a table for all the recent requests? Should I create a minimum time delay (say 30 seconds) between form posts?
Even if I don't use CAPTCHA anymore, there's still the problem of bots posting data onto the website automatically ... as long as they wait out the delay between posts.
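For the minimum-delay idea, here is a minimal sketch in Python (the thread never says what language the site uses, so treat this as pseudocode to port). The names `allow_post`, `MIN_DELAY_SECONDS` and the in-memory dictionary are assumptions for illustration; on a real site the timestamp would probably live in the session or in a small table keyed by user or IP.

```python
import time

MIN_DELAY_SECONDS = 30  # the 30-second figure from the post; tune as needed

# Hypothetical in-memory store: user id -> time of the last accepted post.
# A real site would keep this in the session or a database table.
_last_post_time = {}

def allow_post(user_id, now=None):
    """Return True and record the post if the user waited long enough."""
    now = time.time() if now is None else now
    last = _last_post_time.get(user_id)
    if last is not None and now - last < MIN_DELAY_SECONDS:
        return False  # too soon; reject without updating the timestamp
    _last_post_time[user_id] = now
    return True
```

Only one timestamp per user is stored, so this is much cheaper than drawing an image for every form render, but it does nothing against a bot that simply waits 30 seconds between posts.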
What do you guys have in mind?
You could always outsource your CAPTCHA to ReCAPTCHA, that would move the strain from your server to theirs.
Akismet might also be worthwhile to look at.
You could also pregenerate a bunch of them and store them on the filesystem.
Then store the text for each in the db, which will be very low overhead since you just pick one randomly.
If you're really concerned someone will write a bot which a human can spoonfeed the question/answers to, so it can later recognise one of these known images on its own, you could do a non realtime garbage collection and regeneration routine, so they gradually get cycled out and won't last too long, making it difficult for them to ever get a good chance of getting a known image.
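A rough sketch of the pregenerate-and-recycle idea, in Python for illustration only. `CaptchaPool`, `MAX_AGE_SECONDS` and the `make_captcha` callback are hypothetical names; the callback stands in for whatever code actually draws an image to disk and returns its id and answer text.

```python
import random
import time

MAX_AGE_SECONDS = 24 * 3600  # hypothetical lifetime before an image is retired

class CaptchaPool:
    """Pool of pregenerated CAPTCHAs: id -> (answer, created_at)."""

    def __init__(self, make_captcha):
        # make_captcha is a caller-supplied function that renders one image
        # to disk and returns (captcha_id, answer_text).
        self._make = make_captcha
        self._entries = {}

    def fill(self, size, now=None):
        """Generate entries until the pool holds `size` CAPTCHAs."""
        now = time.time() if now is None else now
        while len(self._entries) < size:
            captcha_id, answer = self._make()
            self._entries[captcha_id] = (answer, now)

    def pick(self):
        """Cheap per-request work: choose one existing CAPTCHA at random."""
        return random.choice(list(self._entries))

    def check(self, captcha_id, typed):
        """Compare the typed text with the stored answer, ignoring case."""
        entry = self._entries.get(captcha_id)
        return entry is not None and entry[0].lower() == typed.strip().lower()

    def recycle(self, size, now=None):
        """Offline job (e.g. cron): drop old entries and top the pool up."""
        now = time.time() if now is None else now
        expired = [cid for cid, (_, created) in self._entries.items()
                   if now - created > MAX_AGE_SECONDS]
        for cid in expired:
            del self._entries[cid]
        self.fill(size, now)
        return len(expired)
```

The expensive drawing happens only in `fill` and `recycle`, which can run outside the request path; rendering a form costs one random pick.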
Do you *need* a CAPTCHA on every form?
On my site, I am removing captcha as much as I can. It's very annoying to the end user.
Only the registration form has a captcha. If the input form is only available to logged-in users, then I guess there's not much need for a captcha, as a secure login can deter a lot of bots.
I agree, I've used AKISMET for another website (a blog), but my current website isn't a blog so I'm wondering if I could still use it.
Originally Posted by khuramyz
If, for some reason, you ever need to have a CAPTCHA check on every form, you can store a temporary cookie after the user has solved one CAPTCHA to remember that the user is a human.
But then a bot would only need to produce this cookie to bypass all CAPTCHAs, you would then have to introduce an authentication system just for the cookie...
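One way to make the cookie hard to forge is to sign it on the server. The following is a sketch only, in Python, using an HMAC over the session id and an expiry time; `SECRET_KEY`, `TOKEN_LIFETIME`, `issue_token` and `verify_token` are made-up names, and the 15-minute lifetime is an arbitrary assumption.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-long-random-server-secret"  # placeholder value
TOKEN_LIFETIME = 15 * 60  # hypothetical: trust a solved CAPTCHA for 15 minutes

def _sign(payload):
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()

def issue_token(session_id, now=None):
    """Cookie value to set after the visitor solves a CAPTCHA."""
    expires = int((time.time() if now is None else now) + TOKEN_LIFETIME)
    payload = f"{session_id}:{expires}"
    return f"{payload}:{_sign(payload)}"

def verify_token(token, session_id, now=None):
    """True only for an unexpired token this server issued to this session."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(":", 1)
        token_session, expires = payload.rsplit(":", 1)
        expires = int(expires)
    except ValueError:
        return False  # malformed token
    # Constant-time comparison so the signature cannot be guessed byte by byte.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return False
    return token_session == session_id and now < expires
```

A bot cannot invent a valid cookie without the server secret, but a cookie copied from a real solved session still works until it expires, so this limits forgery, not replay.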
I guess you don't need a blog, just a WordPress account.
Originally Posted by sk89q
Good idea, but you would still need to have a spam or flood filter on the backend to filter any brute force traffic.
Originally Posted by sk89q
Let's say that a user proves that he is not a machine and then gets the privilege of not having to decode letters from an image. What if someone then brute forces using that account? No CAPTCHA = no security.
OK, so let's say that you do have something on the backend to filter the spam. This works, but you would need to create some sort of flood control system. This is kinda countering the idea of using CAPTCHA, because with a CAPTCHA image, as long as you enter the information correctly you're fine (post as much as you want).
True, but if we're gonna talk about defending against human assisted bots, we're gonna be talking for a lonnnnnnng time :)
Flood control and captcha are different things. Relying on the natural flood control effect of a captcha only limits you to the speed at which a human can enter characters into a text field while looking at an image. It would probably take me all of 15 minutes to write a bot that will present me with your captcha image and a text box. I solve the question, and my bot forwards the answer to your site for me and includes a spam message. It then immediately fetches the next image and presents it to me again. I can probably get a message sent every 5 seconds now, because it doesn't take long for me to read an image and type, and there isn't any form of flood control.
So what are you suggesting?
Well, most sites don't run into a problem of flooding. Flooding one site with a bunch of advertisements at one time is not effective when your point is to advertise. One SQL query could delete a spammer's work. Unless you fear that you will receive continual targeted attacks on your site, it should not be an issue.
Most sites have the problem with entirely automated bots that scan the Internet for forms to fill out. One CAPTCHA is good enough to defeat these bots. You don't even need a complicated CAPTCHA to defeat these.
I'm just suggesting there comes a point where you need to draw the line, and that point is usually around when you start trying to defend against humans (or human assisted bots).
If someone wants your site badly enough to code a bot specifically to defeat your proprietary system, how much of a difference will it make if they only need to give it the answer once, after which it can use the "authorized" cookie to post repeated messages, versus having to actually supply a new answer for each message? Probably not much, but maybe the site really is that desirable, and requiring a human to babysit the bot would make it financially unfeasible if the goal is to send massive amounts of posts.
Perhaps have a random number (say a 1 in 5 chance) determine if a CAPTCHA will appear. This way, if a bot does get to shoot a ton of requests, it might get a few requests in before the random CAPTCHA appears. However, as soon as the bot hits a CAPTCHA input and gets it wrong, it'll have to keep entering the characters until it gets them right. If more than 10 errors are made in the CAPTCHA, then that user account is blocked for a few minutes.
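The random-CAPTCHA idea could look roughly like this. It is a sketch in Python with invented names (`CaptchaGate`, `needs_captcha`, `record_answer`); the 1-in-5 chance and the 10-error limit come from the post, while the 5-minute block is a guess at "a few minutes".

```python
import random
import time

CAPTCHA_PROBABILITY = 1 / 5   # the 1-in-5 chance from the post
MAX_ERRORS = 10               # errors allowed before a temporary block
BLOCK_SECONDS = 5 * 60        # hypothetical length of "a few minutes"

class CaptchaGate:
    """Tracks, for one account, whether a CAPTCHA is due and whether it is blocked."""

    def __init__(self, rng=random.random):
        self._rng = rng           # injectable so the behaviour can be tested
        self.pending = False      # True once a CAPTCHA has been demanded
        self.errors = 0
        self.blocked_until = 0.0

    def is_blocked(self, now=None):
        now = time.time() if now is None else now
        return now < self.blocked_until

    def needs_captcha(self):
        """Once demanded, the CAPTCHA stays required until it is solved."""
        if not self.pending and self._rng() < CAPTCHA_PROBABILITY:
            self.pending = True
        return self.pending

    def record_answer(self, correct, now=None):
        now = time.time() if now is None else now
        if correct:
            self.pending = False
            self.errors = 0
            return
        self.errors += 1
        if self.errors > MAX_ERRORS:   # "more than 10 errors"
            self.blocked_until = now + BLOCK_SECONDS
            self.errors = 0
```

The form handler would call `is_blocked` first, then `needs_captcha` to decide whether to render the image, and `record_answer` after each submitted answer.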
I was thinking of doing something like this on my server:
- I generate the images and store them in my DB (have quite a few, 1000+)
- When I display an image, I select a random one and crop it at some random position (so 0-10 px of the image margins can go away)
When users take actions that update something (my problem is stopping users from updating stuff too much), I add to a counter. If they update more than X things in Y time, I lock their account for 1h and show a CAPTCHA image.
They get Z tries at the CAPTCHA (every time they fail, the CAPTCHA changes, just in case they can't read it), and if they get it right, the account gets unlocked and their counter is reset.
If they get it wrong, they could unlock their profile from some email or something.
The idea is that normal users never see this CAPTCHA unless they update stuff too much, and then they see it once every X minutes.
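The counter-and-lock scheme above can be sketched as follows (Python, for illustration). `AccountThrottle` and the concrete values for X, Y and Z are assumptions, and the 1h automatic expiry of the lock is left out to keep the example short.

```python
import time
from collections import deque

MAX_UPDATES = 20        # hypothetical X
WINDOW_SECONDS = 60     # hypothetical Y
MAX_TRIES = 3           # hypothetical Z

class AccountThrottle:
    """Per-account update counter with a CAPTCHA-guarded lock."""

    def __init__(self):
        self._times = deque()   # timestamps of recent updates
        self.locked = False
        self.tries_left = MAX_TRIES

    def record_update(self, now=None):
        """Return True if the update is allowed; lock the account on overflow."""
        now = time.time() if now is None else now
        if self.locked:
            return False
        # Forget updates that have fallen out of the time window.
        while self._times and now - self._times[0] >= WINDOW_SECONDS:
            self._times.popleft()
        self._times.append(now)
        if len(self._times) > MAX_UPDATES:
            self.locked = True
            self.tries_left = MAX_TRIES
            return False
        return True

    def answer_captcha(self, correct):
        """Return 'unlocked', 'retry' (show a new image) or 'email' (out of tries)."""
        if not self.locked:
            return "unlocked"
        if self.tries_left <= 0:
            return "email"
        if correct:
            self.locked = False
            self._times.clear()   # reset the counter, as described above
            return "unlocked"
        self.tries_left -= 1
        return "retry" if self.tries_left > 0 else "email"
```

Because normal users rarely exceed X updates in Y seconds, no CAPTCHA image is generated for them at all, which is where the server savings come from.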
Note that my problem is for people using bots on one of my browser games, so this system is not designed to stop people from gathering information from my site.
That's exactly what I'm planning to do. Not only for updates, but also for new data creation (private messages, forum posts, new images, etc...).
Let me clarify (I lost track of who was the OP earlier):
You would only remember the CAPTCHA bit for the current session for an X amount of time.
However, what I said wasn't really to solve your problem. It was an addendum to khuramyz's comment about how annoying CAPTCHAs are to end users. I assume the biggest load to your servers for CAPTCHAs would result from the many guests that visit that have no intent of submitting a form. Your server would generate a CAPTCHA image regardless, so my addendum doesn't directly solve your problem.
Are you really expecting targeted attacks against every form on your website?
What about trying RECAPTCHA too?
Use a captcha only on the login form (and any IMPORTANT subsequent forms).
I second or third the recommendation to use reCAPTCHA.
So many sites use CAPTCHAs, and sometimes the CAPTCHA isn't clear, so some visitors just leave. So I think if you want more users, don't use a CAPTCHA. If you want to limit the number of users, you should use it.