Google Captures reCAPTCHA

Google has acquired reCAPTCHA Inc, the company that developed the popular spam and fraud prevention system.

CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are the squiggly letters or words you must read and enter when completing a web form. They’re a hindrance and are ultimately doomed by computer advances, but they can help sift the spambots from real human users.

reCAPTCHA has been implemented on 100,000 websites. It’s one of the better systems and many would say it’s the best:

  • The service is free.
  • The widget is easy to embed in your web forms.
  • An audible version of the CAPTCHA is provided for visually-impaired users.

reCAPTCHA is also unique because it improves the process of digitizing books and other paper-based documents. The widget presents the user with two scanned words that failed optical character recognition (OCR). One of the words is already known but the other requires further human verification. If the user solves the known word, reCAPTCHA assumes their answer for the second word is probably correct too. The system can verify the second word with a high level of confidence once several users have entered the same text.
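The two-word check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not reCAPTCHA's actual code: the `WordVerifier` class, its method names, and the agreement threshold of three matching answers are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching answers are needed before
# an unknown word is treated as transcribed.
REQUIRED_AGREEMENT = 3


class WordVerifier:
    def __init__(self, required_agreement=REQUIRED_AGREEMENT):
        self.required_agreement = required_agreement
        # Maps an unknown word's id to a tally of the answers users gave.
        self.votes = defaultdict(Counter)

    def submit(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Return True if the user passes the CAPTCHA."""
        if control_guess.strip().lower() != control_answer.lower():
            # The known word was wrong, so the other guess is discarded too.
            return False
        self.votes[unknown_id][unknown_guess.strip().lower()] += 1
        return True

    def resolved(self, unknown_id):
        """Return the agreed transcription, or None if votes are insufficient."""
        counts = self.votes.get(unknown_id)
        if not counts:
            return None
        answer, count = counts.most_common(1)[0]
        return answer if count >= self.required_agreement else None
```

Only users who solve the known word get a vote on the unknown one, which is what lets the system trust answers it cannot check directly.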

reCAPTCHA is a positive example of crowd-sourcing: it distributes a laborious process among millions of users. Books are digitized quickly and accurately, developers can be sure a real user has accessed their site, and user inconvenience is kept to a minimum.

The deal is a win for both companies. reCAPTCHA will benefit from Google’s financial and marketing clout. Google can replace their awful in-house CAPTCHA system and use reCAPTCHA’s digitizing facilities to assist with their ambitious and controversial book-scanning project.

Neither company has revealed the full terms of the buyout, but Google is unlikely to receive much change from $500 million.

Do you use reCAPTCHA? Does this deal make us ever-more reliant on Google services?

  • WebKarnage

    I’m not sure it will be such a win for us users yet, but maybe I’m just not as pro Google with things other than search. It is a very cleverly worked out system, but how Google’s ownership of it will change that is all speculation I suppose.

    Thanks for keeping us up-to-date as usual!

    with best regards,
    Karn.

  • http://www.gospelrhys.co.uk/ rhysboy84

    I use it, have written plugins for it, and do like it. Hopefully Google can make it even better (though not sure how), and not “Feedburnerize” it.

  • Daim

    Captcha is okay but I think KittenAuth is a much better solution (http://www.thepcspy.com/kittenauth)

    It will certainly be less “doomed” by computer advances.

  • zoliky

    Who cares? Mollom is 1000x better than reCAPTCHA.

  • http://www.optimalworks.net/ Craig Buckler

    @Daim
    I don’t think KittenAuth is any better — computers can already recognise images, although it’s tougher than OCR. Even so, random guessing will thwart KittenAuth 1 in 84 times. Spambots could make thousands of guesses within minutes.

    In comparison, five random letters have a 1 in 12 million chance of being guessed correctly.
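The two odds quoted above can be checked with a little arithmetic. The sketch below assumes KittenAuth asks for 3 kittens out of a grid of 9 images (which is what produces 84 combinations) and that the text CAPTCHA uses 5 case-insensitive letters from a to z. The `expected_hits` helper is a hypothetical name used only for this example.

```python
from math import comb

# Assumption: a grid of 9 images from which 3 kittens must be chosen.
kitten_combinations = comb(9, 3)

# Assumption: 5 characters drawn from the 26 case-insensitive letters a-z.
text_combinations = 26 ** 5


def expected_hits(guesses, combinations):
    """Expected number of lucky passes for purely random guessing."""
    return guesses / combinations


print(kitten_combinations)                                 # 84
print(text_combinations)                                   # 11881376
print(round(expected_hits(10_000, kitten_combinations)))   # 119
```

At 10,000 random guesses a bot would expect roughly 119 passes against the image grid, but well under one against the five-letter text.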

  • corbyboy

    I am getting a little worried about having too much reliance on Google on my website, but the services they offer are so good that it’s hard to give them up.

    That said, as soon as the request URL is changed to recaptcha.google.com I will probably take the decision to stop using it.

  • http://www.optimalworks.net/ Craig Buckler

    @corbyboy
    I doubt Google will change the request URL and it’d be possible to support the old one even if they did.

    I wrote this article a few months ago…
    Have We Become Too Dependent on Google?

  • fproof

    Never understood the popularity of reCaptcha. To me it’s just a horrible and annoying thing. As far as I know my senses are still 100%, but how many times I tried to pass the verification without success… And the audible version isn’t much better either.
    The concept is really clever, but it’s just too difficult to pass it, especially for people without an extensive English vocabulary. Actually, we’ve tried it out on a popular website over here, but had to replace it because of too many complaints.

    Now that Google has absorbed the company, I can only hope they come up with a more user-friendly solution. Maybe something based on the image rotation technique they were working on?

  • http://www.dynamicalsoftware.com gengstrand

    What a brilliant move to leverage collective intelligence. A fine example of crowd-sourcing to exploit the human need to verify your humanity in the service of digital translation of books.

  • http://pixopoint.com/ ryanhellyer

    How on earth would a captcha system be worth US$500 million? That number seems way too high.

    Where did you get that figure?

  • http://www.optimalworks.net/ Craig Buckler

    @ryanhellyer
    The consensus on several business sites is that Google will be paying somewhere in the region of $500-700 million. The full details are yet to be made public and much of that could be made up from shares or other deals.

    It does seem a lot, though — especially when you consider that it’s $5,000 per website that uses it. However, I suspect the captcha technology is of secondary interest to Google — it’s the book digitizing system they really want. You also need to look at the potential; Google have the ability to add reCAPTCHA to millions of websites.

  • anon

    One of your figures must be wrong:

    $500,000,000 for a service implemented on 100,000 websites? That’s $5,000 per instance?

    Yes, I appreciate it’s the technology and people they’re buying, but frankly it’s still ridiculous.

  • fproof

    Is the whole book digitizing system of reCaptcha patented then? Otherwise I can’t believe that figure either. The technology isn’t really rocket science, is it?

  • the.peregrine

    We’ve been using Phil Haack’s honeypot captcha technique for about six months, and so far we’re very happy with it.

    CAPTCHA is a hindrance, as you say, and the accessibility issues are compounded for people who have cognitive and/or visual disabilities. The honeypot captcha is an elegantly simple solution.

    To me, Google’s acquisition of reCAPTCHA is a non-event, full of sound and fury, signifying nothing.

  • the.peregrine

    Clarification: In the “honeypot captcha” link I posted above, note also the comment from Mathieu Henri about combining the honeypot technique with filtering for the timestamp. If a page is submitted less than three seconds after it’s generated, either the submitter is a bot or it’s a human error. Either way, the validation process kicks it out.

    Depending on the complexity of the form, I might allow a little more than three seconds. But you get the idea.
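A minimal sketch of the combined honeypot and timestamp check might look like the following. It is not Phil Haack's or Mathieu Henri's code: the `looks_like_bot` function and the `website` field name are hypothetical, and the sketch assumes the server records when the page was generated.

```python
import time

HONEYPOT_FIELD = "website"   # hypothetical field name, hidden from humans with CSS
MIN_SECONDS = 3.0            # threshold mentioned above; tune per form


def looks_like_bot(form, rendered_at, now=None):
    """Return True if the submission should be rejected.

    form        -- dict of submitted field names to values
    rendered_at -- time.time() value recorded when the page was generated
    now         -- current time; defaults to time.time() (overridable for tests)
    """
    now = time.time() if now is None else now
    if form.get(HONEYPOT_FIELD, "").strip():
        return True   # the hidden field was filled in, which a human would not do
    if now - rendered_at < MIN_SECONDS:
        return True   # submitted too quickly after the page was generated
    return False
```

The render time should be kept server-side (in a session, for example) or signed before it is sent to the browser, since a bot could otherwise simply submit an older timestamp.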
