If you’ve used the internet at all since 2003, you’ve seen them around.

At this point, CAPTCHA tests are part of the fabric of the internet. Websites use them to distinguish humans from robots—useful if you need to protect financial information, for instance, or prevent hackers from spamming a form with hundreds of automated entries.

But how do CAPTCHA tests work, exactly, and what makes them effective against automated hacking attempts?

The goal of CAPTCHA was originally to create a type of text that computers couldn’t read.

Several research teams claim to have invented the CAPTCHA. The first (and most famous) team worked at Altavista, a search engine that was fairly popular in the late 1990s; their goal was to prevent bots from automatically adding pages to the Altavista web index.

To create the CAPTCHA, the Altavista team studied the manual of their office scanner. The scanner came with software that could recognize text characters, and the manual included instructions for improving readability. The Altavista team decided to simply reverse the instructions, purposely creating text that would be difficult for computers to decipher.

But in 1997, another team applied for a patent on a CAPTCHA-like technology. That application predates the Altavista team’s patent by a year. Yet another team published a description of CAPTCHA technology in 2003. These competing claims have prompted a lengthy dispute over who can really lay claim to the technology.

However, we do know who invented the CAPTCHA acronym.

That would be the 2003 team, led by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford.

Image: Luis von Ahn, courtesy of CAPTCHA

CAPTCHA stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.” Granted, the creators left out a few words to make the acronym work; “CAPTTTTCAHA” isn’t quite as catchy.

Some modern CAPTCHAs work by asking users to click certain pictures.

For instance, one popular CAPTCHA tool often asks users to “click all of the images that include a street sign.” If you’ve ever wondered whether to select the part of the image with the signpost (it’s part of the sign, right?), we’ve got good news for you: it doesn’t matter.

These image-based systems work by tracking mouse movements, click locations, and other variables, not by assessing the user’s individual abilities. In fact, Google’s “No CAPTCHA reCAPTCHA” skips all of the hubbub and simply asks users to click a single box.

“For most users, this dramatically simplifies the experience,” Vinay Shet, the product manager for Google’s Captcha team, told Wired.com. “They basically get a free pass. You can solve the captcha without having to solve it.”

CAPTCHAs frustrate users because they’re purposely difficult to read.

However, there’s a method to the madness. Text CAPTCHAs are purposely hard to “segment,” which means that it’s difficult to separate the letters from one another. The letters also vary dramatically in style, because computers struggle to account for that kind of variation.
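
To make the segmentation idea concrete, here is a minimal Python sketch. It isn't taken from any real CAPTCHA or text-recognition system; it assumes a toy image format (rows of `#` for ink and `.` for background) and a hypothetical helper, `segment_columns`, that splits text wherever it finds a completely blank column. When the letters “H” and “I” are spaced apart, the helper finds two pieces. When they overlap, it finds only one blob.

```python
def segment_columns(rows):
    """Split a toy glyph grid into (start, end) runs of inked columns."""
    width = len(rows[0])
    # A column counts as "inked" if any row has a '#' in that position.
    inked = [any(row[x] == "#" for row in rows) for x in range(width)]

    segments, start = [], None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x                      # a new run of ink begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # a blank column ends the run
            start = None
    if start is not None:
        segments.append((start, width))    # run reaches the right edge
    return segments


# "H" and "I" with two blank columns between them.
clean = [
    "#.#..###",
    "###...#.",
    "#.#..###",
]

# The same two letters pushed together so that they share a column.
crowded = [
    "#.###",
    "####.",
    "#.###",
]

print(segment_columns(clean))    # [(0, 3), (5, 8)]
print(segment_columns(crowded))  # [(0, 5)]
```

This blank-column rule is only one simple way to split letters, chosen here because it is easy to follow. It shows why crowding the characters together makes the job harder for software.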

Most CAPTCHAs use words, since humans can immediately use context clues to differentiate between, say, “dog” and “dag” if the “o” and “a” look similar.
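
The “dog” versus “dag” example can be modeled with a few lines of Python. This is a rough sketch of the context-clue idea, not a description of how people or any real CAPTCHA actually work. The word list `WORDS` and the function `resolve` are made up for illustration. Each position in the word lists the letters it might be, and the function keeps only the combinations that spell a known word.

```python
from itertools import product

# Hypothetical, tiny word list; a real dictionary would be far larger.
WORDS = {"dog", "dig", "cat", "cot"}


def resolve(readings, words=WORDS):
    """Return the known words that match the possible letters per position.

    `readings` holds one string per position, listing every letter that
    position might be (for example "oa" when an "o" looks like an "a").
    """
    candidates = ("".join(letters) for letters in product(*readings))
    return sorted(c for c in candidates if c in words)


# The middle letter could be "o" or "a", but only "dog" is a word.
print(resolve(["d", "oa", "g"]))  # ['dog']

# A random string offers no such help: neither "doq" nor "daq" is a word.
print(resolve(["d", "oa", "q"]))  # []
```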

While these tests are popular, they’re also controversial, since they’re often difficult for human users with poor vision. Additionally, they’re not foolproof; there’s little research into the effectiveness of text-based CAPTCHAs, and some experts have suggested that they’re easily exploited.

In 2014, Google released an analysis showing that modern artificial intelligence can “solve even the most difficult variant of distorted text at 99.8 percent accuracy.”

Keep that in mind the next time you run across a text-based CAPTCHA test; you’re going through all of that trouble for…well, nothing.