I have a page that uses kcaptcha to prevent automated submissions.
The captcha is checked on the server.
After each submission, I delete the current session that stores the captcha number.
I don’t know why the form is still being spammed.
Captchas are a form of Turing test: they check whether the submitter appears to be a person rather than a bot.
If a person sits in front of your webpage with a captcha then yes, they can submit all the spam they like in that one session - and no, you cannot stop them doing this.
The captcha is designed to stop a bot which will be able to submit your form possibly up to hundreds or thousands of times per minute.
If you are getting hundreds or thousands per minute then something is wrong with your captcha/session code.
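To make "something is wrong with your captcha/session code" more concrete, here is a minimal Python sketch of a single-use, server-side check. It is not kcaptcha's code; the `SESSIONS` dict and the function names are hypothetical stand-ins for your real session handling. One failure mode worth ruling out is a check that passes when both the stored value and the submitted value are missing or empty, so the sketch rejects that case explicitly.

```python
import hmac
import secrets

# Hypothetical in-memory session store: session_id -> expected captcha text.
SESSIONS = {}

def issue_captcha(session_id):
    """Generate a fresh code and remember it for this session."""
    code = "".join(secrets.choice("0123456789") for _ in range(6))
    SESSIONS[session_id] = code
    return code  # a real page would render this as an image, not return it

def verify_captcha(session_id, answer):
    """Check the answer once; the stored code is consumed on every attempt."""
    expected = SESSIONS.pop(session_id, None)  # single use, even on failure
    if expected is None or not answer:
        return False  # no captcha was issued, or the answer is empty
    return hmac.compare_digest(expected.encode(), answer.encode())
```

With this shape, replaying an old answer or posting the form without ever loading the captcha both fail, because there is nothing left in the session to compare against.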
There is an entire industry built upon paying very small amounts to people who will sit in front of a computer submitting captchas on behalf of their clients.
One thing I introduced on a forum that brought spam to a standstill was requiring users to answer 3 questions when registering. The questions are not complicated, but they need either a bit of research or some knowledge of what the forum is about.
I was getting about 100 spam posts a day, and since that was introduced over a year ago I have not had one. I assume the spammers cannot be bothered to research the answers.
I run some local UK sites where we do not ask people to register, only to contribute, and the Turing test consists of asking a text question that has to be answered with a number, such as “What number is half a dozen?” This cut out all spam too.
Maybe half a dozen is too much of a colloquialism, but the site is aimed at a fixed geographical area.
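For anyone who wants to try the question approach, a rough Python sketch might look like this. The question bank and the `check_answer` name are made up for illustration; only the first question comes from my site.

```python
import re

# Hypothetical question bank: question text -> accepted numeric answer.
QUESTIONS = {
    "What number is half a dozen?": 6,
    "How many days are there in a week?": 7,
}

def check_answer(question, reply):
    """Accept the reply only if it is the expected number written in digits."""
    expected = QUESTIONS.get(question)
    match = re.fullmatch(r"\s*(\d+)\s*", reply or "")
    if expected is None or match is None:
        return False  # unknown question, or reply is not a plain number
    return int(match.group(1)) == expected
```

Asking for digits keeps the comparison simple: "6" and " 6 " pass, while "six" is rejected, so the form should say that a number is expected.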
I don’t think they are doing it manually, because after I deleted all the spam data and came back 5-6 hours later there were already one or two hundred new records. So I think there must be some code that can automatically read the text in the image.
OK, that link shows a potential source of your problem.
The captcha you have on that form is technically very weak.
Basically, it is simple black text on a white background, and it is very easy to programmatically separate out the letters and read them. The slight distortion you have on the letters is essentially useless. Now admittedly, if someone wanted to use your website to send out spam or run some malicious code, they would need code that reads your “weak” captcha in order to break it. For someone who knows what they are doing, that is not too difficult a task.
The same applies to “question type” captchas. They are useless against someone determined to break the captcha. All someone has to do is open the form enough times to collect all the questions, or at least most of them, and then have their code supply the correct answer to whichever question it encounters.
I have not read the whole article, but it seems that code can read the text in a captcha image even when the text is complicated.
So I think I need to change the captcha.
Is reCAPTCHA (Google's captcha) OK? Is it strong enough to stop the text from being detected?
Thank you very much for the PDF article, it taught me a lot.
I haven’t used reCAPTCHA for years now. That was before Google took it over. Back then it was one of the more robust captchas (among the free ones at least) going around. The main thing I didn’t like about it is that you have to rely on a 3rd party server to create each captcha test and to compare the user’s response with the correct answer. I subsequently built my own captcha system based on that article. I don’t know if Google has made any changes or improvements to reCAPTCHA.
If you’re sure your code is working correctly, then the technical weakness of your captcha image is probably the issue, so giving reCAPTCHA a try is probably worth it.
Or, if it’s possible, just change the captcha images without altering your current code.
The way hackers try to detect your letters is firstly by reducing your image (colour or not) down to black text on white using a “thresholding” technique (see the article). As the colours are removed, the hacker hopes to be left with only clear, well defined black text on a white background. A robust, well designed colour captcha image will be such that as the colours are removed, parts of the letters, if not all of them, are removed as well. In that case the hacker never gets to see a distinguishable set of letters at the end of thresholding.
If the hacker can get your image to a state of distinguishable black letters on a white background then your captcha is effectively broken. To locate the letters the code just needs to look for the location of the black pixels. The article also discusses how poor attempts at randomly locating characters and poorly distorting them are essentially useless.
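To show how little code the two steps above need, here is a toy Python sketch of thresholding followed by locating the letters from the black pixels. It is my own simplification, not the article's code: the image is a hand-written grid of grayscale values (0 = black, 255 = white) and the function names are made up.

```python
def threshold(gray, cutoff):
    """Map a grayscale image (rows of 0-255 ints) to 1 = black, 0 = white."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]

def letter_spans(binary):
    """Return (first, last) column ranges that contain any black pixel."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a run of inked columns begins
        elif not ink and start is not None:
            spans.append((start, x - 1))   # the run ended at the previous column
            start = None
    if start is not None:
        spans.append((start, width - 1))   # run touches the right edge
    return spans

# Toy 3x9 image: two dark "letters" (value 20) on a light, slightly noisy background.
image = [
    [240, 20, 20, 235, 250, 245, 20, 20, 240],
    [250, 20, 230, 240, 245, 250, 20, 20, 235],
    [245, 20, 20, 250, 240, 235, 20, 20, 250],
]
spans = letter_spans(threshold(image, 128))
print(spans)  # [(1, 2), (6, 7)]
```

Once each letter has its own column range, it can be cropped out and matched on its own, which is why clean black-on-white text offers so little protection.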
To get a feel for how robust your captcha image is against thresholding techniques that try to get black text on a white background, you can use the thresholding features in something like Photoshop or Photoshop Elements (and probably other image editing applications) to see how much of the letters is still intact as you move the threshold down to 0.
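If you generate the image yourself, you know which pixels belong to the letters, so the same check can be roughed out in code. The Python sketch below is only a crude score I am making up for illustration, with hypothetical names: for each cutoff it compares the share of letter pixels that turn black with the share of background pixels that turn black. A best score near 1 means some cutoff isolates the text cleanly, which is bad news for the captcha.

```python
def separation(gray, letter_mask, cutoff):
    """Share of letter pixels turned black minus share of background turned black."""
    letter_total = letter_black = bg_total = bg_black = 0
    for gray_row, mask_row in zip(gray, letter_mask):
        for px, is_letter in zip(gray_row, mask_row):
            black = px < cutoff
            if is_letter:
                letter_total += 1
                letter_black += black
            else:
                bg_total += 1
                bg_black += black
    letters = letter_black / letter_total if letter_total else 0.0
    background = bg_black / bg_total if bg_total else 0.0
    return letters - background

def best_separation(gray, letter_mask, cutoffs=range(32, 256, 32)):
    """Best score over the tried cutoffs; near 1.0 means easy to threshold."""
    return max(separation(gray, letter_mask, c) for c in cutoffs)

mask = [[0, 1, 0], [0, 1, 0]]                # 1 marks a letter pixel
weak = [[240, 20, 240], [240, 20, 240]]      # dark text on a light background
mixed = [[40, 200, 60], [210, 50, 190]]      # text and background share brightness
print(best_separation(weak, mask))   # 1.0
print(best_separation(mixed, mask))  # 0.0
```

The toy grids are tiny on purpose. On a real image you would pass the grayscale pixel rows and the letter mask from your own generator.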