You are probably aware of Google’s efforts to digitize as many books as possible, scanning (literally) tons of print books into their computers. Sometimes there are technical glitches or errors when scanning these books in: print may be smudged or damaged, especially if it is very old. So Google came up with a brilliant yet simple solution to this problem: get people to unwittingly decipher the unreadable words for them, for free.
You see, even now, there are some things that are very easy for a person to do, but very difficult for a computer to do. Recognizing distorted text turns out to be one of these things. Google doesn’t want to pay an army of people to read through these books and transcribe them. That would take forever, to say nothing of the cost (although I’m pretty sure Google is more concerned with the time than the cost). So how do you trick people into doing this for you?
You might not know the term, but a “captcha” is one of those things you have to fill out on the Internet to prove you’re not a spambot. You know, look at the image and write the text that you see, something like:
Well, if you’re using Google’s free reCAPTCHA product on your site (and who wouldn’t? If it’s good enough for Google, it’s good enough for me!), it not only verifies that your visitor is not a robot, but also helps decode a particularly tricky word.
Take the example in the picture above. The first part (Years) decides if you’re a robot or not. The second part (maybe subioik?) is a picture of a word from a paper book which was scanned in, but not understood by the Google computers. If enough people answer the same thing on that captcha, the Google hive mind can assume that that image correctly maps to subioik.
In other words, it might take a computer a million years to run algorithms to figure that out, but Google could accomplish the same in about 30 seconds by harnessing millions of people trying to enter porn sites.
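For the curious, here is roughly how I imagine the two-word trick working, as a minimal Python sketch. To be clear, this is my own guess at the logic, not Google's actual code: the function names, the three-vote minimum, and the 75% agreement threshold are all made up for illustration. The idea is that an answer for the unknown word only counts as a vote if the same person got the known word right, and the unknown word is considered solved once enough votes agree.

```python
from collections import Counter


def normalize(text):
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


def collect_votes(submissions, known_word):
    """Keep the unknown-word answers from people who passed the control word.

    submissions is a list of (control_answer, unknown_answer) pairs.
    Anyone who misses the known word is treated as a bot (or a bad typist)
    and their guess at the unknown word is thrown away.
    """
    return [
        normalize(unknown)
        for control, unknown in submissions
        if normalize(control) == normalize(known_word)
    ]


def consensus(votes, min_votes=3, min_share=0.75):
    """Return the agreed transcription, or None if there is no agreement yet.

    min_votes and min_share are hypothetical thresholds chosen for this
    sketch; the real values, if they exist, are not public knowledge here.
    """
    if not votes:
        return None
    word, count = Counter(votes).most_common(1)[0]
    if count >= min_votes and count / len(votes) >= min_share:
        return word
    return None


# Each pair is (answer for the known word, answer for the smudged word).
submissions = [
    ("Years", "subioik"),
    ("years", "Subioik"),
    ("yaers", "xxxx"),      # failed the control word, so this guess is ignored
    ("Years", "subioik"),
    ("Years", "subiolk"),   # passed the control word, but misread the smudge
]

votes = collect_votes(submissions, known_word="Years")
print(consensus(votes))
```

With these made-up submissions, three of the four trusted votes agree, so the sketch settles on "subioik". One dissenting human doesn't matter; one failed robot doesn't even get a say.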
This is such a simple, obvious solution to the “smudged word” problem. It’s so smart it’s almost scary, like encountering an alien intelligence far beyond my own. In fact, when I first heard about this, I dismissed it as some kind of crazy conspiracy theory. Not so. Information is available directly from Google.
I don’t see anything wrong with them doing this per se, but it still leaves a funny feeling in my stomach. I feel manipulated somehow. And anybody that ingenious is a little scary. So far they seem to be using their power for good, but what if that changed? I don’t know, man… for some reason, this disturbs me greatly.
Link via Sylvain