Want To Know How Far Artificial Intelligence Has Come? Just Look At CAPTCHA

Apr 16, 2019
Originally published on April 16, 2019 6:27 pm
Copyright 2019 NPR. To see more, visit https://www.npr.org.

AUDIE CORNISH, HOST:

We're going to look now at the state of artificial intelligence this month in All Tech Considered.

(SOUNDBITE OF ULRICH SCHNAUSS' "NOTHING HAPPENS IN JUNE")

CORNISH: I'm not a robot. You've probably seen that statement online alongside a prompt that says something like, type the letters you see, or, click on all the stoplights. Do it right, and you get to go on to the next page. These games are developed by Google. They're called CAPTCHAs. Researcher Jason Polakis of the University of Illinois at Chicago has proven that, in fact, robots are pretty good at CAPTCHAs.

JASON POLAKIS: It's a very basic class of challenges that have been created in a way to have little tasks that are easy for humans to solve, but difficult for computers.

CORNISH: Is this truly a security feature, then, or a way to train artificial intelligence programs?

POLAKIS: It's actually a combination of both. So when some of the seminal original papers came out describing CAPTCHAs, they were seen both as a challenge that can prevent automated actions from computers - so something that has a very specific security spin to it - but also, the fact that you get feedback from users - they can help you train your system and train your models and then have stronger AI and machine learning techniques moving forward.

CORNISH: Now, let's talk about the moving forward, then, because they have changed over time. Have they gotten harder? And if so, why?

POLAKIS: They've gotten both harder and easier. So CAPTCHAs are actually a really good example of the arms race between defenders and attackers in security. As attackers get better and new techniques come out and machine learning improves and you can actually automatically infer, for example, what words or what letters were in the CAPTCHA, then defenders try to prevent that by making their challenges harder. And then once you reach a point where text CAPTCHAs are very hard for humans...

CORNISH: Just to jump in here, then - so what you're saying is we started out with the text CAPTCHAs.

POLAKIS: Yes.

CORNISH: So this is maybe a combination of words and letters, but they may be distorted or the letters look all kind of wiggly and wavy. And a human should be able to discern it, even if a computer can't.

POLAKIS: Exactly.

CORNISH: Why, though, (laughter) are we now in a phase of, like, click - find the three stoplights or find the three foot bridges? Like, I have, sometimes, this feeling of, like, oh, God (laughter). Like, I'm having trouble with this.

POLAKIS: The thing is that at some point, the text CAPTCHAs became so hard for humans and so unusable, whereas computers were really good at breaking them, that it just didn't make sense anymore to have them. So it was necessary to switch to a different design. And Google actually released version 2 of reCAPTCHA in, like, late 2014. And the whole idea there was that the challenges are now going to be much easier because they will ask you to, like, identify images with cats or dogs or, you know, cups of coffee.

However, once we did our research when that system first came out and we showed that deep learning systems were already at a point where they could, let's say, return keywords that describe the content of images, and we showed how you could misuse those to automatically solve reCAPTCHA's challenges, then again, they were in a spot where they would - where they needed to make these systems even harder, the challenges harder to prevent, you know, machine learning systems from solving them.
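
(EDITOR'S NOTE: The keyword-matching idea described above can be illustrated with a minimal sketch. It is not the researchers' code. The function name select_tiles, the synonym list and the example tags are all hypothetical, and the tags are hard-coded stand-ins for what an image-labeling model might return.)

```python
from typing import Iterable, List, Set


def select_tiles(
    keyword: str,
    tile_tags: List[Set[str]],
    synonyms: Iterable[str] = (),
) -> List[int]:
    """Return the indices of tiles whose tags mention the keyword or a synonym."""
    # Normalize the challenge keyword and any synonyms to lowercase.
    wanted = {keyword.lower(), *(s.lower() for s in synonyms)}
    # A tile is selected when at least one of its tags is a wanted word.
    return [
        i for i, tags in enumerate(tile_tags)
        if wanted & {t.lower() for t in tags}
    ]


# Made-up tags: an assumption standing in for real image-labeling output.
example_tags = [
    {"cat", "pet", "whiskers"},
    {"coffee", "cup", "table"},
    {"dog", "grass"},
    {"kitten", "sofa"},
]

print(select_tiles("cat", example_tags, synonyms=["kitten"]))  # [0, 3]
```

(In this toy version, a challenge asking for "cat" picks the first and fourth tiles only because "kitten" was supplied as a synonym. A real system would depend on the quality of the labels it gets back.)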

CORNISH: In the end, people may look at this as a kind of nuisance. For them, it's a little bit like coming to a stoplight, and they're traveling the Web. But what are the implications, you think, for computer science more broadly?

POLAKIS: I definitely agree that it's a nuisance for many people, but we need to keep in mind that both from the security perspective and the improving machine learning perspective, it's something that's positive for all of us. When a website can actually use CAPTCHAs to prevent automated attacks, that means that as users, we're going to be affected less.

And on the other hand, there have been multiple attempts by major companies and researchers to actually harness, let's say, the feedback from users to actually improve their systems. So, for example, anything that has to do with computer vision or, like, autonomous vehicles - the fact that we can provide feedback through the CAPTCHA system, which can allow companies to retrain their classifiers and have more accurate models - that's also beneficial to all of us.

CORNISH: Jason Polakis, thank you so much for speaking with us and for explaining it.

POLAKIS: Thanks for having me on the show.

CORNISH: Jason Polakis is an assistant professor of computer science at the University of Illinois at Chicago. Transcript provided by NPR, Copyright NPR.