Human-aided Bot Operation

Published on Sep 23, 2019

We’d like to think that we can outsmart computers. If we’re trying to defend a website against bots, we’d like to be able to include a CAPTCHA and call it a day. Sure, we might need to contort some letters or sprinkle some lines and dots in the background, but ultimately, we feel pretty confident that we can come up with a sufficient roadblock that stops bots while letting humans through. Unless -- and there’s a faint worry in the back of our minds -- machine learning has progressed far enough that maybe bots can actually defeat our twisted text CAPTCHAs.

It turns out that it’s actually not that difficult to come up with alternate CAPTCHAs that don’t involve users deciphering mangled text. Unless those particular CAPTCHAs are very popular, it’s very unlikely that there is an automated way of solving them. At the same time, it’s easy to become overconfident as illustrated by this XKCD comic: the vulnerability is not so much that the bot can overcome the CAPTCHA, but rather that the bot can simply outsource CAPTCHA work to humans. There are a number of CAPTCHA-breaking networks that pay humans very small amounts of money to solve a CAPTCHA challenge. Consider the varying levels of cooperation between bots and humans:

Level 1: Naive bot, no human interaction

Some scrapers do not account for CAPTCHAs.

Countermeasure: Any CAPTCHA will be effective in this trivial case.

Level 2: Bot with CAPTCHA-breaking capabilities

Some CAPTCHAs (e.g. distorted text or math problems) have been around long enough that libraries have been developed to circumvent those CAPTCHAs. Bots can use those libraries and be programmed to fill in specific fields to answer CAPTCHA challenges.

Countermeasure: A sufficiently non-standard CAPTCHA should be effective until the CAPTCHA-breaking software improves.

Level 3: Bots requiring human configuration but not ongoing human interaction

Some simple but non-standard CAPTCHAs can be effective in reducing bot access. For example, a honeypot CAPTCHA presents a field that is hidden from human users, so that only bots see it (e.g. clients with JavaScript disabled or clients that do not properly handle CSS). If the honeypot field is filled in, the site deduces that a bot has filled out the form. Another simple CAPTCHA asks the user to type a specific word into a specific field. Bots will not understand the instruction and will not fill out the field properly, which identifies them as bots. Some bot operators can overcome this by configuring their software to ignore certain fields or to fill out certain fields in a certain way.
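As a rough illustration of the honeypot idea, the server-side check can be as small as the sketch below. The field name `website` and the function `looks_like_bot` are hypothetical names chosen for this example; the sketch assumes the form arrives as a simple mapping of field names to string values and that the honeypot field is hidden from human visitors.

```python
from typing import Mapping


def looks_like_bot(form: Mapping[str, str], honeypot_field: str = "website") -> bool:
    """Return True when the hidden honeypot field came back non-empty.

    A human never sees the field, so it should be missing or blank.
    """
    return bool(form.get(honeypot_field, "").strip())
```

A form where the hidden field is blank or absent passes; a form where it contains any text is flagged.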

Countermeasure: Websites might be able to counter these bots by adding sufficient variety. For example, they might use randomly generated values for the id of the CAPTCHA field. Or, they could change the special word on each page load. These strategies are still suspect because a sufficiently configurable bot could be programmed to read the CAPTCHA instructions (which are meant to be easy for humans to understand) and act accordingly. More effective would be stronger CAPTCHAs that require more user interaction.
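One way to picture the "sufficient variety" approach is the sketch below, which picks a fresh field id and a fresh word on every page load and remembers them in the visitor's session. Everything here is an assumption made for illustration: the word list, the `new_challenge` and `check_challenge` names, and the use of a plain dict as the session store. As noted above, a bot that parses the instruction text would still get through.

```python
import secrets

# Hypothetical word list; a real deployment would want a much larger one.
WORDS = ["maple", "orbit", "candle", "river", "pixel"]


def new_challenge(session: dict) -> dict:
    """Create a per-page-load challenge and remember the expected answer."""
    field_id = "f_" + secrets.token_hex(8)  # random id for the CAPTCHA field
    word = secrets.choice(WORDS)            # special word changes on each load
    session["captcha"] = {"field_id": field_id, "word": word}
    return {
        "field_id": field_id,
        "instruction": f"Type the word '{word}' in the box below.",
    }


def check_challenge(session: dict, form: dict) -> bool:
    """Check the submitted form against the stored challenge (single use)."""
    expected = session.pop("captcha", None)
    if expected is None:
        return False
    answer = form.get(expected["field_id"], "").strip().lower()
    return secrets.compare_digest(answer.encode(), expected["word"].encode())
```

The challenge is removed from the session as soon as it is checked, so a bot operator cannot record one field id and word and replay them on later submissions.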

Level 4: Bots that use remote humans on an ongoing basis

Suppose you come up with a CAPTCHA that is impossible for computers to crack automatically. Perhaps you have discovered some novel way to transform characters. Or perhaps you render the instructions to type a specific word as an image so that computers cannot easily read the text; even if they can, they still need to figure out which word is the magic keyword, and you have taken care to vary the instructions so that the keyword is not always the last word. Or perhaps you have a very large collection of questions whose answers can be found on the internet, but which bots cannot easily answer. Here, bots do not need to become better; they can simply send the CAPTCHA challenge to humans, who solve it relatively cheaply. The humans load the image or text and send back the answer.

Countermeasure: There are multiple methods of countering this scenario. One possibility is to require that the user’s browser support JavaScript, and have the browser perform some trivial calculation that can be verified by the server. For example, JavaScript might append the timestamp of submission, which the server can verify. While humans can supply the correct solution from the image, that solution lacks the required JavaScript component, and hence, the CAPTCHA would be effective. Another JavaScript solution would be to require mouse movement, such as the Pick Shape CAPTCHA. Unfortunately, some security features (e.g. requiring JavaScript and mouse movements) come at the cost of making the site less accessible (e.g. to users who are unable to use mice). We recognize that the world is full of tradeoffs, and our design goal has been to offer a range of different CAPTCHA types and options and allow clients to configure the Shibboleth service to meet their specific needs.
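The timestamp example can be sketched on the server side as follows. This is a minimal illustration, not a description of how Shibboleth works: the function name `timestamp_is_plausible`, the millisecond unit, and the two-minute tolerance are all assumptions. It also assumes the page's JavaScript placed the submission time in a form field whose raw value is passed in here.

```python
import time
from typing import Optional


def timestamp_is_plausible(raw_value: Optional[str],
                           now_ms: Optional[int] = None,
                           max_skew_ms: int = 120_000) -> bool:
    """Return True if the JavaScript-supplied timestamp is present and recent.

    A submission assembled without running the page's JavaScript will have
    no value (or a malformed or stale one) and is rejected.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    try:
        submitted_ms = int(raw_value)
    except (TypeError, ValueError):
        return False  # field missing or not a number
    return abs(now_ms - submitted_ms) <= max_skew_ms
```

A bare timestamp is easy for a bot author to forge once they notice it, so this check mainly illustrates the principle that the answer alone is not enough; a harder-to-reproduce calculation would follow the same verify-on-the-server pattern.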

Level 5: The bot and the human are one: full browser emulation

The bot might be a full-fledged browser and when it detects a CAPTCHA element, it waits for a human to interactively control it to solve the CAPTCHA (similar to how screensharing software allows clients to take control of a desktop or an application). All human input is captured and emulated.

Countermeasure: It seems unlikely that there is an effective obstacle to this behavior for public sites relying only on CAPTCHA, since it would be virtually indistinguishable from humans browsing the site. However, this type of arrangement would require fairly sophisticated software and would likely cost much more in human labor. One consolation, as illustrated by the bear joke, is that you don’t need to make your website completely impervious to bots; you only need to make it difficult enough that it would be less expensive to just contact you and work with you directly.

To that end, one possible mitigation is to limit the time allowed to solve the CAPTCHA, which makes scraping more expensive because a human needs to be standing by. Humans can likely defeat this by simply refreshing the page when they are ready to solve the CAPTCHA, at which point you can track the number of unsolved CAPTCHAs… and the arms race continues.
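The time limit and the unsolved-CAPTCHA count can be combined in one small bookkeeping structure, sketched below. The class name `ChallengeTracker`, the 30-second limit, the threshold of five, and the idea of keying clients by an opaque string (such as an IP address or session id) are all assumptions for illustration; state is kept in memory only.

```python
import time
from collections import defaultdict
from typing import Optional


class ChallengeTracker:
    """Track issued CAPTCHAs, enforce a time limit, and count unsolved ones."""

    def __init__(self, time_limit_s: float = 30.0, max_unsolved: int = 5):
        self.time_limit_s = time_limit_s
        self.max_unsolved = max_unsolved
        self._issued = {}                  # challenge_id -> (client, issued_at)
        self._unsolved = defaultdict(int)  # client -> challenges not solved in time

    def issue(self, challenge_id: str, client: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._issued[challenge_id] = (client, now)
        self._unsolved[client] += 1        # counts as unsolved until proven otherwise

    def solve(self, challenge_id: str, now: Optional[float] = None) -> bool:
        """Return True only if the challenge exists and was answered in time."""
        now = time.monotonic() if now is None else now
        entry = self._issued.pop(challenge_id, None)
        if entry is None:
            return False                   # unknown or already used
        client, issued_at = entry
        if now - issued_at > self.time_limit_s:
            return False                   # too slow; stays counted as unsolved
        self._unsolved[client] -= 1
        return True

    def is_suspicious(self, client: str) -> bool:
        """True once a client has left more than max_unsolved challenges unsolved."""
        return self._unsolved[client] > self.max_unsolved
```

A client that repeatedly loads the page and abandons or times out on the CAPTCHA, as the refresh trick requires, accumulates unsolved challenges and eventually crosses the threshold.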

At NetToolKit, we have spent time thinking through these vulnerabilities and have tried to engineer protections against various threats. We believe Shibboleth can handle any level of bot-human integration except for full-browser emulation (we’re working on another service for that). If you have any notes to add or ideas for new and fun CAPTCHAs, please let us know.