More than 40 years ago, in 1978, a computer vendor in the USA sent the first spam email. Little more than 20 years later, in the early 2000s, it looked as if spam would finally kill off email altogether. The huge quantities of junk email being generated threatened to overwhelm the world’s inboxes and stifle productivity completely. It was a stroke of good luck that artificial intelligence (AI), in the shape of machine learning (ML), emerged at around the same time to help combat the onslaught. ML sifts through massive amounts of data and learns to recognise the patterns that mass mailings have in common. AI is sometimes used as a catch-all term, when in practice most companies are using machine learning, which can’t draw new conclusions without new training data. Today, machine learning can spot spam, but because of its limits, humans need to step in from time to time.
AI is not completely error-free, of course. There are still false positives, and it has to be constantly retrained to stay one step ahead of the spammers, but it has nevertheless proved to be a faster and more consistent tool than its human equivalent.
Artificial intelligence and spam filters
Nowadays the vast majority of spam emails have far less chance of making it into an email user’s inbox because spam filters are constantly evolving to keep up. In their simplest form, filters apply fixed rules that screen out messages containing suspect words, and the list of suspect words is itself constantly changing.
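As a rough illustration of the simplest kind of filter described above, here is a minimal sketch of a word-based rule. The word list and the function name `is_suspect` are made up for this example and are not taken from any real product.

```python
import re

# Hypothetical list of suspect words; a real filter would keep updating it.
SUSPECT_WORDS = {"viagra", "lottery", "winner"}

def is_suspect(message, suspect_words=SUSPECT_WORDS):
    """Return True if the message contains any word on the suspect list."""
    words = re.findall(r"[a-z0-9]+", message.lower())
    return any(word in suspect_words for word in words)

print(is_suspect("You are a lottery WINNER"))  # True
print(is_suspect("Cheap V1agra here"))         # False: the altered spelling slips through
```

The second call shows why fixed word lists need constant upkeep: a single swapped character is enough to get past the rule.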
Machine learning allows computers to process data and learn for themselves without being manually programmed. An ML-based spam filter can learn in several ways, but it first has to be trained on a large amount of data from emails already recognised as spam, so that it can identify the patterns they share. The ML algorithm then automatically creates a new rule for the spam filter.
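One common way to learn patterns from already labelled email is a naive Bayes classifier. The sketch below uses that technique purely as an illustration. The article does not say which algorithm any particular filter uses, and the four training messages are invented.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def train(labelled_messages):
    """Count word frequencies per class from (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in labelled_messages:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Return the label with the higher log-probability (Laplace smoothing).

    Assumes the training data contained at least one message of each class.
    """
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total = sum(class_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy data; a real filter trains on a very large labelled corpus.
TRAINING = [
    ("win a free prize now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(TRAINING)
print(classify("free prize offer", *model))     # spam
print(classify("monday team meeting", *model))  # ham
```

The model knows nothing beyond the words it was trained on, which is the limitation the article returns to later: without new training data it cannot adapt to new spam.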
Another way to train spam filters with the help of ML is user feedback. If enough users mark emails containing the word “V1agra” as unwanted, then the filter automatically creates a new rule for it.
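The feedback loop can be sketched as follows. The threshold of three users, the ignored-word list, and the class name `FeedbackFilter` are all assumptions made for this example.

```python
class FeedbackFilter:
    """Toy filter that adds a word rule once enough distinct users report it."""

    def __init__(self, threshold=3, ignored_words=frozenset({"buy", "now", "the", "a"})):
        self.threshold = threshold          # hypothetical number of reports needed
        self.ignored_words = ignored_words  # everyday words that never become rules
        self.reporters_by_word = {}         # word -> set of user ids
        self.rules = set()                  # words that are now blocked

    def report_spam(self, user_id, message):
        for word in set(message.lower().split()) - self.ignored_words:
            reporters = self.reporters_by_word.setdefault(word, set())
            reporters.add(user_id)
            if len(reporters) >= self.threshold:
                self.rules.add(word)

    def is_blocked(self, message):
        return any(word in self.rules for word in message.lower().split())

spam_filter = FeedbackFilter()
for user in ("alice", "bob", "carol"):
    spam_filter.report_spam(user, "Buy V1agra now")

print(spam_filter.is_blocked("v1agra for sale"))  # True
print(spam_filter.is_blocked("team lunch"))       # False
```

Counting distinct users rather than raw reports means one person cannot create a rule alone by reporting the same message repeatedly.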
Hackers often re-use code that already exists, in different configurations, because writing malware from scratch takes a lot of time and effort for an attack that might not have a large payoff. That re-use leaves recognisable patterns that a trained filter can pick up.
So far so good, but does this mean that artificial intelligence can replace humans? The best possible spam filter at the moment still relies on human beings and machines working together, rather than in isolation. Why is this, and will the balance of power ever change in favour of AI?
In 1997, IBM’s Deep Blue computer defeated world chess champion Garry Kasparov, the first time a reigning world champion had lost a match to a computer under standard tournament conditions. Deep Blue used its massive computing power to evaluate as many possible continuations as it could in the three minutes it was allowed for each move. Since the turn of the millennium, a similar reliance on raw computing power has been applied to machine learning.
The human brain nevertheless does have a unique advantage over this artificial brain power — it can solve a problem that it has never seen before. Nothing in machine learning is like that. Everything it does has been designed to deal with a specific problem. Its mastery of any particular task is not transferable — reassuring for those who are worried that computers and robots will one day start thinking for themselves.
By thinking outside the box, experienced email security experts (let’s call them ‘spam cops’) can assess the threat potential of individual spam emails much more comprehensively than a machine ever can.
The algorithms that these experts program automatically analyse questions such as ‘Where and who does the email come from?’ and ‘Have there been any malicious communications from this domain reported by other users before?’, but human intervention is then needed to act on the red flags and dig deeper.
Only humans can identify social trends. There has been a rash of EU GDPR-related phishing scams, as well as sextortion emails in which recipients were threatened with compromising videos being sent to their contacts unless a ransom was paid. Only humans recognise trends like these and can act on them.
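The two automated checks mentioned above might look something like this minimal sketch. The domain lists, report counts, and the function name `red_flags` are hypothetical. The code only collects flags and leaves the judgement to a human analyst.

```python
from email.utils import parseaddr

# Hypothetical reputation data; a real system would query shared report databases.
REPORTED_DOMAINS = {"bad-offers.example": 12}
TRUSTED_DOMAINS = {"partner.example"}

def red_flags(from_header, reported=REPORTED_DOMAINS, trusted=TRUSTED_DOMAINS):
    """Return a list of red flags for a human analyst to review."""
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower() if "@" in address else ""
    flags = []
    if not domain:
        flags.append("no sender domain")
    elif domain not in trusted:
        flags.append("sender domain is not on the trusted list")
    if reported.get(domain, 0) > 0:
        flags.append(f"{reported[domain]} earlier reports for {domain}")
    return flags

print(red_flags("Deals <promo@bad-offers.example>"))
print(red_flags("Ann <ann@partner.example>"))  # []
```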
Spam or not? Artificial intelligence or human?
In addition to the anti-spam specialists, there is a second human factor in the evaluation of spam: the user.
Take ‘greymail’, for example, a third category that sits between spam and wanted email. Users currently have an edge over machines when assessing it. It is called ‘grey’ because its sender is neither on the blacklist of blocked senders nor on the user’s whitelist of approved senders. This is email that your spam filter isn’t quite sure what to do with until it has learned a bit more about it. Some users mark it as spam but others don’t, which makes its status ambiguous.
Around 75% of the email messages that people report as spam are actually legitimate newsletters, offers, or notifications that, after initial interest, they just don’t want anymore. Over time, the spam filter will learn what the recipient considers to be ‘greymail’ based on these actions, as well as on the actions of all other recipients of emails sent from that particular domain name.
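A filter could combine both signals, the crowd’s verdict on a sender and the individual recipient’s own choice, roughly as in the sketch below. The 20% and 80% cut-offs and both function names are assumptions chosen for illustration only.

```python
def sender_status(spam_reports, deliveries, low=0.2, high=0.8):
    """Label a sender domain from the share of recipients who reported it."""
    if deliveries <= 0:
        raise ValueError("deliveries must be positive")
    ratio = spam_reports / deliveries
    if ratio >= high:
        return "spam"
    if ratio <= low:
        return "legitimate"
    return "greymail"  # recipients disagree, so the status is ambiguous

def deliver_to_inbox(status, user_marked_spam=None):
    """For greymail, the recipient's own earlier choice decides; otherwise the crowd does."""
    if status == "greymail" and user_marked_spam is not None:
        return not user_marked_spam
    return status != "spam"

status = sender_status(spam_reports=45, deliveries=100)
print(status)                                            # greymail
print(deliver_to_inbox(status, user_marked_spam=True))   # False
print(deliver_to_inbox(status, user_marked_spam=False))  # True
```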
Limits of artificial intelligence spotting spam
However, there are limits to the ability of artificial intelligence to spot spam. This is why human expertise and strategic and creative thinking are indispensable. In addition, hackers are becoming increasingly sophisticated in overcoming defence systems, which makes it more difficult to defend against attacks, especially since the attackers also use AI and deep learning (DL). Machine learning’s most common role, then, is complementary. It acts as a security guard, rather than a cure-all solution.
This is why it is important that ML systems are set up to include humans in the process at various junctures, so that computer systems aren’t making decisions completely on their own. The technology should always have the option of saying ‘this doesn’t look normal’ or ‘I have never seen this before’ and defer to a human for a second opinion.
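In code, the option to defer might be as simple as the sketch below. The probability thresholds, the `seen_before` signal, and the function name `decide` are hypothetical. The point is only that the system has a third outcome besides blocking and delivering.

```python
def decide(spam_probability, seen_before, block_at=0.95, allow_at=0.05):
    """Act automatically only when the model is confident; otherwise defer."""
    if not seen_before:
        return "ask_human"  # 'I have never seen this before'
    if spam_probability >= block_at:
        return "block"
    if spam_probability <= allow_at:
        return "deliver"
    return "ask_human"      # 'this doesn't look normal'

print(decide(0.99, seen_before=True))   # block
print(decide(0.50, seen_before=True))   # ask_human
print(decide(0.99, seen_before=False))  # ask_human
```

Where to set the thresholds is a trade-off: a wider band sends more email to human reviewers but lets the machine make fewer unchecked decisions.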