Rise of the bots: what you need to know about protecting your organisation from bad bots

A bot is simply any automated tool or script designed to perform a specific task. Some are good, some are bad; what they share is the ability to carry out those tasks far faster than any human could.

Bots can be programmed to do anything a human can do on a computer, as long as that task has specific, logical and often repeatable steps. They are the key perpetrators of web content scraping, brute force attacks, competitive data mining, account hijacking, unauthorised vulnerability scans, man-in-the-middle attacks, spam, click fraud, and pretty much anything else that has a negative impact on a website’s success.

> See also: How Anthem was breached – and how you can prevent it happening to you

With so many companies relying on their websites as their ‘window to the world’ for brand awareness and doing business with customers, bad bot traffic has a direct financial effect: lost business, plus the cost of additional web server infrastructure to handle the extra load.

Good bots vs bad bots

Good bots – automated agents deployed by search engines, such as Googlebot or Bingbot – are essential to the operation of the web as we know it today and represent 36 per cent of web traffic. They crawl online sites to build catalogues of information that can then be used by people around the world. The vast majority of purchases start with Internet research, so the activities these bots undertake provide a greater good to companies everywhere.

Bad bots use the same capabilities as good bots but differ in how they are deployed and in what is done with the data they gather. They frequently employ a variety of measures to avoid detection, because they are used in ways that would normally be damaging to the sites they visit or crawl.

They currently make up more than 23 per cent of all traffic on the web and over 8 per cent of mobile web traffic. To put it all in perspective, almost 60 per cent of your web infrastructure costs could be supporting non-human visitors.

Simple, average and sophisticated bots

Bad bots are getting more sophisticated. The 2015 Bad Bot Report found that 41 per cent of bad bots attempted to enter a website’s infrastructure disguised as legitimate human traffic; worse, 7 per cent were sophisticated enough to disguise themselves as good bots. The bad bot market is also becoming more equal-opportunity, as the number of countries originating at least 1 per cent of worldwide bot traffic has almost doubled in the past year.

The simplest of bad bots – some 47 per cent of those we saw last year – are most effective at grabbing the entire source code of a web page using common scripting languages. These bots tend to be easily detected through their use of bad user agents or their failure to pass basic browser integrity checks.
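
To make that concrete, here is a minimal sketch of such a check, assuming an Express-based Node.js site written in TypeScript; the user agent patterns are illustrative examples rather than a maintained blocklist.

```typescript
import express from "express";

const app = express();

// Illustrative patterns only -- real deployments rely on maintained
// signature lists, not a handful of hard-coded strings.
const BAD_AGENT_PATTERNS = [/curl/i, /python-requests/i, /wget/i, /^$/];

app.use((req, res, next) => {
  const agent = req.get("User-Agent") ?? "";
  if (BAD_AGENT_PATTERNS.some((pattern) => pattern.test(agent))) {
    // Fails the most basic integrity check: no plausible browser user agent.
    res.status(403).send("Forbidden");
    return;
  }
  next();
});

app.get("/", (_req, res) => {
  res.send("Hello, human");
});

app.listen(3000);
```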

Average bad bots – 30 per cent of the 2014 total – are often wrapped in components of web browsers, allowing them to parse and interpret the contents of web pages using JavaScript or another scripting language. This makes them more dangerous to online businesses than the simple bad bots mentioned above, but they can be trapped by forcing them to prove they are using a real web browser.
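
A minimal sketch of that ‘prove you’re a real browser’ approach, again assuming an Express/TypeScript stack: the middleware serves a tiny inline script that sets a cookie and reloads the page, so clients that never execute JavaScript never get past it. The cookie name and static value are assumptions for the example.

```typescript
import express from "express";

const app = express();

// Clients that cannot execute the inline script never acquire the cookie,
// so they never reach the real content. The cookie value is static here
// for brevity; a production version would use a signed, expiring token.
app.use((req, res, next) => {
  const cookies = req.headers.cookie ?? "";
  if (cookies.includes("js_check=1")) {
    next();
    return;
  }
  res.send(`
    <html><body>
      <script>
        document.cookie = "js_check=1; path=/";
        location.reload();
      </script>
      <noscript>Please enable JavaScript to continue.</noscript>
    </body></html>
  `);
});

app.get("/", (_req, res) => {
  res.send("Verified browser");
});

app.listen(3000);
```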

The most sophisticated, and fastest-growing, category of bad bots uses browser automation tools such as Selenium and PhantomJS to replicate human browsing behaviour.
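
One common countermeasure is to probe for the artefacts these tools leave behind in the browser. The sketch below, in client-side TypeScript, checks a few well-known (and easily spoofed) signals; the reporting endpoint is hypothetical.

```typescript
// Client-side probe for common automation artefacts. These signals
// (navigator.webdriver, PhantomJS globals) are well known and easily
// spoofed, so real products combine many such signals server-side.
function looksAutomated(): boolean {
  const w = window as any;
  return (
    navigator.webdriver === true ||        // set during Selenium/WebDriver sessions
    typeof w.callPhantom === "function" || // PhantomJS artefact
    typeof w._phantom !== "undefined"      // PhantomJS artefact
  );
}

if (looksAutomated()) {
  // The reporting endpoint is hypothetical; the server would fold this
  // signal into its overall scoring of the visitor.
  navigator.sendBeacon("/bot-signal", JSON.stringify({ automated: true }));
}
```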

Different site sizes and industries have different vulnerabilities

Small websites, unsurprisingly, are more vulnerable to the performance hits and brownouts that accompany high levels of bad bot traffic (32 per cent). They tend to have less robust infrastructure and security.

Digital publishers proved to be the most attractive to bad bots with 32 per cent of bad bot traffic directed at them. The reason for this is that their sites are full of public-facing intellectual property in the form of articles, research and other content. Scraping this content and re-using it across the web is a visible sign of bad bots in action.

The hypercompetitive world of online travel booking and promotion came in second, attracting almost 28 per cent of bad bot traffic. Here, the emphasis is on tracking competitors’ price changes and responding to them in order to win customers. For e-commerce sites, this can have a painful effect on business levels and margins.

Cleaning up your site traffic

There are some simple steps any business can take to reduce the impact of bad bots on their sites.

Know what traffic on your site should look like, so you can see when something is out of whack. If your business operates only in specific geographies, you can simply block IP addresses from outside those areas, a technique known as Geo-IP fencing. Since you are not selling in those markets, you won’t lose business by doing so. If you do sell in a market through partners, you can route that traffic to a specific page that directs real prospects to contact the partner instead.
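
As a rough illustration of Geo-IP fencing, the sketch below uses the geoip-lite package as Express/TypeScript middleware; the allowed country codes and the partner landing page path are assumptions made for the example.

```typescript
import express from "express";
import geoip from "geoip-lite";

const app = express();

// Countries the business actually serves -- illustrative values only.
const ALLOWED_COUNTRIES = new Set(["GB", "IE"]);

app.use((req, res, next) => {
  const location = geoip.lookup(req.ip ?? "");
  // Unknown locations (private ranges, missing data) are let through here;
  // a stricter policy could challenge them instead.
  if (location && !ALLOWED_COUNTRIES.has(location.country)) {
    // Send visitors from partner-served regions to a partner landing page
    // (the path is an assumption for this example).
    res.redirect("/partners");
    return;
  }
  next();
});

app.listen(3000);
```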

Work with other parts of the business that are active online. For example, make sure your marketing department is auditing ad campaigns for signs of click fraud. Implement CAPTCHAs and other Turing tests to filter out at least some of the non-human traffic.
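
CAPTCHAs themselves are usually provided by third-party services, but a lighter-weight filter in the same spirit is a honeypot form field: a hidden input that humans never fill in but simple form-filling bots do. A sketch, assuming an Express/TypeScript contact form and a hypothetical field name:

```typescript
import express from "express";

const app = express();
app.use(express.urlencoded({ extended: false }));

// The contact form includes a hidden input named "website" (name assumed
// for this example) that humans never see or fill in. Simple form-filling
// bots populate every field, so a non-empty value is a strong non-human signal.
app.post("/contact", (req, res) => {
  const honeypot = req.body.website;
  if (typeof honeypot === "string" && honeypot.trim() !== "") {
    res.status(400).send("Submission rejected");
    return;
  }
  // ...process the genuine enquiry here...
  res.send("Thanks, we'll be in touch");
});

app.listen(3000);
```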

Recognise what your current web security arsenal can’t do. For example, Web Application Firewalls (WAFs) can catch simple bots using IP blocking, but you’ll need to continually monitor and update the blocked address ranges to avoid shutting out genuine human visitors. WAFs may also pick up some bots through user agent testing, but the increased use of spoofing will make this approach increasingly unreliable. Similarly, DDoS protection solutions are not the answer, as they are designed to deal with volume rather than subtlety.

Consider new ways to spot and track bots. Effective bot detection, like many other areas of information security, comes down to differentiating between human and non-human web traffic. An array of targeted tools – inline fingerprinting, behavioural modelling, rate limiting, browser automation detection, and comparison against a known violator database – can be deployed against that non-human traffic. Implementing some of these approaches can help stop bad bot activity while still supporting good bots and human visitors.
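
Rate limiting is the most straightforward of these to sketch. The hand-rolled, in-memory example below caps requests per IP per minute; the thresholds are illustrative, and it is a sketch of the general idea rather than any particular vendor’s implementation.

```typescript
import express from "express";

const app = express();

// Naive in-memory rate limiter: at most MAX_REQUESTS per IP per window.
// Thresholds are illustrative; production systems use a shared store
// (e.g. Redis) and combine rate data with fingerprinting and
// behavioural scoring rather than relying on counts alone.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 120;
const hits = new Map<string, { count: number; windowStart: number }>();

app.use((req, res, next) => {
  const key = req.ip ?? "unknown";
  const now = Date.now();
  const entry = hits.get(key);

  if (!entry || now - entry.windowStart > WINDOW_MS) {
    hits.set(key, { count: 1, windowStart: now });
    next();
    return;
  }

  entry.count += 1;
  if (entry.count > MAX_REQUESTS) {
    res.status(429).send("Too many requests");
    return;
  }
  next();
});

app.listen(3000);
```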

The future of bots

Bots follow where the traffic goes. It’s why Chrome has become the most attractive browser to imitate. It’s why we’re seeing a rapid rise in bots travelling across Chinese mobile carrier networks. It’s why we saw a tenfold increase in the percentage of mobile bad bots last year. The growth around mobile will undoubtedly continue apace, and online businesses should ensure they are doing as much to protect their mobile sites as their desktop sites.

Coming up fast behind mobile is the Internet of Things. As more devices join the Internet and pass data between themselves and their companies, bot traffic will change as well. Envisaging a world where bad bots are setting their sights on APIs – web, mobile, and machine-to-machine – is not something most of us want to contemplate. As is evident from the Bad Bot Report, it is imperative that businesses develop their security strategies so that they can carry their operations forward but leave the bad bots in the dust.

Sourced from Rami Essaid, CEO, Distil Networks
