Defending AI in the Adversarial Environment

AI — in cybersecurity and in general — is vulnerable to adversarial attacks. Here’s why, plus some defense strategies to increase your cyber resilience.

by Stephanie Overby

Oct 23, 2020

Key Points

Organizations and their cybersecurity functions are increasingly deploying AI to improve efficiency, effectiveness and decision making.
These AI systems are vulnerable to attacks designed to corrupt their models and outputs.
Cybersecurity leaders can employ techniques, such as adversarial training, to make their AI models more resilient to attacks.

Cybersecurity leaders are increasingly encouraging organizations to deploy AI to strengthen their cyber resilience against cybercriminals who use the same intelligent capabilities to boost ransomware, email phishing scams and other attacks. But what happens when the AI itself is hacked?

That’s the latest threat cybersecurity teams must consider as they employ AI for their own cyber defenses and seek to protect the growing number of AI-enabled technologies and processes in their organizations. “If you use machine learning for any security-related function, you will be vulnerable to these attacks,” says Dr. Herbert Roitblat, Principal Data Scientist for Mimecast and a recognized AI expert.

Machine learning is dependent on data to learn, and that creates a clear target for corrupting AI models. Machine learning works by learning statistical associations or patterns which, it turns out, are relatively easy to disrupt. As a Harvard research paper explains, “Unlike humans, machine learning models have no baseline knowledge that they can leverage—their entire knowledge depends wholly on the data they see. Poisoning the data poisons the AI system.”[1]
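To make the poisoning idea concrete, here is a minimal sketch in Python. It is not taken from the Harvard paper or from any production system: the one-dimensional "suspicion score", the nearest-centroid classifier and every sample value are hypothetical, chosen only to show how mislabeled training data can shift what a model learns.

```python
from statistics import mean

def train_centroids(samples):
    """Learn one centroid per label from (score, label) pairs."""
    by_label = {}
    for score, label in samples:
        by_label.setdefault(label, []).append(score)
    return {label: mean(scores) for label, scores in by_label.items()}

def predict(centroids, score):
    """Return the label whose centroid is closest to the score."""
    return min(centroids, key=lambda label: abs(score - centroids[label]))

# Hypothetical training set: low scores are legitimate, high scores are phishing.
clean = [(1.0, "legit"), (2.0, "legit"), (3.0, "legit"),
         (8.0, "phish"), (9.0, "phish"), (10.0, "phish")]

# Assumed attack: phishing-like samples slipped into the data, mislabeled as legitimate.
poison = [(9.0, "legit"), (10.0, "legit"), (11.0, "legit")]

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

probe = 7.0  # a borderline message
print(predict(clean_model, probe))     # phish
print(predict(poisoned_model, probe))  # legit
```

In this toy setup, three poisoned records pull the "legit" centroid from 2.0 to 6.0, which is enough to let the borderline message through. Real models and real attacks are far more complex, but the dependency on training data is the same.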

Malevolent Machine Learning: AI Under Cyberattack

As AI matures and is adopted more widely in the enterprise, bad actors naturally are employing techniques like data poisoning and adversarial examples (inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake) to infect these systems and influence their output.[2],[3] A very public case in point: It took less than a day for Twitter users to corrupt Tay, the AI chatbot Microsoft introduced in 2016. After ingesting a stream of racist and misogynistic tweets, the bot itself began posting inflammatory and offensive tweets in return.[4]

While the fallout for Microsoft was reputational, potential adversarial attacks on AI could cause other, more serious harm, as a recent article in the high-performance computing journal Datanami pointed out: “A military drone misidentifies enemy tanks as friendlies. A self-driving car swerves into oncoming traffic. An NLP bot gives an erroneous summary of an intercepted wire. These are examples of how AI systems can be hacked, which is an area of increased focus for government and industrial leaders alike.”[5]

Indeed, just last year hackers manipulated the intelligent systems on board a Tesla Model S and, among other things, sent the car into oncoming traffic.[6] The MIT Technology Review explained: “This is an example of an ‘adversarial attack,’ a way of manipulating a machine-learning model by feeding in a specially crafted input. Adversarial attacks could become more common as machine learning is used more widely, especially in areas like network security.”[7]

Fighting Fire with Fire to Make AI in Cybersecurity More Resilient

Experts expect the use of advanced AI in both cyberattacks and companies’ cybersecurity efforts to rise significantly in 2020. One underlying reason for the rapid growth of AI-oriented cyberattacks is the open source nature of so much AI research and development—and cybercriminals’ ability to adopt new tech faster than many large organizations.

Bob Adams, a Mimecast security strategist, explained it this way in a 2020 predictions article for VMblog.com: “Organizations aiming to implement AI will have to spend time monitoring their tool’s effectiveness and accuracy, while attackers have incredible amounts of information gathered through open-source intelligence. The balance between those trying to defend versus attack is a digital battlefield that will see a continually shifting landscape.”

Adversarial AI “is a big problem,” UC Berkeley professor Dawn Song said last year at an MIT Technology Review event, where she urged industry leaders to come together to fix it.[8] DARPA launched the Guaranteeing AI Robustness against Deception (GARD) program in 2019 to identify vulnerabilities, bolster AI robustness and build defensive mechanisms that are resilient to AI hacks.[9]

On the local level, there are some open source approaches cybersecurity leaders can take advantage of to make AI models more resilient to attacks. The field of adversarial machine learning has emerged to study the vulnerabilities of machine learning models and algorithms and to secure them against adversarial manipulation. One of its techniques, adversarial training, attempts to improve an AI model’s cyber resilience by incorporating adversarial samples into its training.[10] The theory is that training an AI model with your own adversarial data sets will enable it to classify, and learn to ignore, adversarial data in the real world.
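As a rough illustration of adversarial training, the Python sketch below trains a tiny logistic regression model on each sample and on an adversarially shifted copy of it. Everything here is an assumption made for the example: the two-feature toy data, the choice of the fast gradient sign method (FGSM) to craft the adversarial copies, and the values of eps, lr and epochs. It is not the procedure of any particular product or paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability that x belongs to class 1 under weights w and bias b."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Shift each feature by eps in the direction that increases the log loss."""
    err = predict_proba(w, b, x) - y  # derivative of the loss w.r.t. the logit
    return [xi + eps * sign(err * wi) for wi, xi in zip(w, x)]

def adversarial_train(data, eps=0.5, lr=0.1, epochs=50):
    """Gradient descent on every clean sample plus its FGSM-perturbed copy."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = fgsm(w, b, x, y, eps)  # crafted against the current model
            for sample in (x, x_adv):
                err = predict_proba(w, b, sample) - y
                w = [wi - lr * err * si for wi, si in zip(w, sample)]
                b -= lr * err
    return w, b

# Hypothetical toy data: class 0 sits lower left, class 1 upper right.
data = [([-2.0, -2.0], 0), ([-3.0, -1.0], 0), ([-1.0, -3.0], 0),
        ([2.0, 2.0], 1), ([3.0, 1.0], 1), ([1.0, 3.0], 1)]

w, b = adversarial_train(data)
for x, y in data:
    attacked = fgsm(w, b, x, y, 0.5)
    print(y, predict_proba(w, b, x) > 0.5, predict_proba(w, b, attacked) > 0.5)
```

The design choice worth noting is that the adversarial copies are regenerated against the current model at every step, so the model keeps seeing the perturbations that would hurt it most at that moment rather than a fixed, stale set.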

What You Can Do to Enhance Resilience of Cybersecurity AI

“The more an adversary knows about what you do, the more they can emulate it,” Roitblat says. That thinking informs a number of actions cybersecurity leaders can take when developing AI systems to better protect them from attacks, including:

Changing the input features of open source data sets (e.g., low and high frequency, multiple sensors, novel features).
Randomizing inputs (e.g., jitter, scale, rotation, added noise).
Avoiding commonly used libraries.
Training on unique data.
Analyzing patterns of errors in training data.
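One of the actions above, randomizing inputs, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the scale range, the jitter size and the stand-in classifier are all hypothetical, and rotation is left out because it applies mainly to image data.

```python
import random

def randomize_input(features, rng, scale_range=(0.9, 1.1), jitter=0.05):
    """Apply one random scale factor, then add small independent noise per feature."""
    scale = rng.uniform(*scale_range)
    return [x * scale + rng.uniform(-jitter, jitter) for x in features]

def predict_with_randomization(model, features, rng, copies=25):
    """Majority vote of the model over several randomized copies of the input."""
    votes = sum(bool(model(randomize_input(features, rng))) for _ in range(copies))
    return votes * 2 > copies

def toy_model(features):
    """Hypothetical stand-in classifier: flags the input when its features sum above zero."""
    return sum(features) > 0

rng = random.Random(0)  # seeded here only to make the example repeatable
print(predict_with_randomization(toy_model, [1.0, 1.0], rng))  # True
```

The idea is that an attacker who tunes an input against one exact preprocessing pipeline has a harder time when the pipeline changes slightly on every call. In production the random generator would not use a fixed, guessable seed.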

The Bottom Line

As AI becomes ever more important and ingrained in organizations and their cybersecurity systems, it becomes a bigger target for cyberattack. Cybersecurity leaders must understand the potential for adversarial attacks on AI systems they use and explore approaches, such as adversarial training, to combat this growing threat.

[1] Attacking Artificial Intelligence: AI’s Security Vulnerability and What Policymakers Can Do About It, Harvard Kennedy School’s Belfer Center for Science and International Affairs

[2] “Adversarial Machine Learning at Scale,” Google and Université de Montréal

[3] “Attacking Machine Learning with Adversarial Examples,” OpenAI Blog

[4] “Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day,” The Verge

[5] “Hacking AI: Exposing Vulnerabilities in Machine Learning,” Datanami

[6] Experimental Security Research of Tesla Autopilot, Tencent Keen Security Lab

[7] “Hackers trick a Tesla into veering into the wrong lane,” MIT Technology Review

[8] “How malevolent machine learning could derail AI,” MIT Technology Review

[9] Guaranteeing AI Robustness Against Deception (GARD), Dr. Bruce Draper, Defense Advanced Research Projects Agency

[10] “Adversarial Attacks and Defenses in Deep Learning,” Engineering