Fascinating project. The idea is that the optical character recognition (OCR) processes at the world’s largest book digitization projects inevitably make lots of mistakes and run into plenty of words the computer can’t recognize, especially with older books or books printed with messier inks or less-precise fonts. Rather than having staffers laboriously read every word of every book just to correct the clinkers, reCAPTCHA puts the hive mind to work every time a member of the public solves a captcha.
About 60 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that’s not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into “reading” books.
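For the curious, the arithmetic behind that figure holds up. Here is a quick back-of-the-envelope sketch in Python, using only the two numbers from the quote above; the function and constant names are mine for illustration, not anything from reCAPTCHA itself.

```python
# Back-of-the-envelope check of the figures quoted above.
# Both inputs are the rounded estimates from the quote, not measured data.
CAPTCHAS_PER_DAY = 60_000_000   # "about 60 million"
SECONDS_PER_CAPTCHA = 10        # "roughly ten seconds"


def daily_hours(captchas_per_day: int, seconds_each: float) -> float:
    """Total human time spent solving captchas per day, in hours."""
    return captchas_per_day * seconds_each / 3600  # 3600 seconds per hour


hours = daily_hours(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} hours per day")  # prints "166,667 hours per day"
```

That works out to roughly 166,667 hours a day, which squares with the "more than 150,000 hours" claim.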
I’m going to replace a few captchas I’ve got in place at the J-School with reCAPTCHAs. I’d been meaning to add audio accessibility to them anyway, and reCAPTCHA has an audio option built in. Being able to contribute to book digitization is delicious gravy.
Update: Adding this video thanks to Jeremy: