The terms machine learning and Artificial Intelligence (AI) cover a broad field with roots in statistics and operations research. At its core, AI predicts outcomes from data: it builds models that automatically infer patterns in the data and use those patterns to make decisions. AI is quickly becoming a critical technology for the digital society and for industry. We increasingly depend on AI's ability to learn from past experience, to reason, to discover meaning in and classify complex data, to make critical decisions, and to automate processes and decision making.
AI pervasiveness gives rise to «Adversarial Artificial Intelligence (AAI)», in which attackers (A) exploit weaknesses in AI models in use to craft attacks that compromise them, and (B) use AI to scale and automate elements of attacks that were previously simply impossible (such as DeepFakes) or relied heavily on manual processes.
AAI causes machine learning models to misinterpret inputs and behave in a way that favours the attacker. To compromise a model's behaviour, attackers craft «adversarial examples / data»: inputs that often resemble normal data but are designed to break the model's performance. The model then misclassifies these adversarial examples and outputs incorrect answers with high confidence.
Cheap computational power and an abundance of collected data have allowed modellers, and attackers, to develop increasingly complex AI models at low cost. As the accuracy and complexity of AI models have grown, many of the behaviours they capture defy comprehensive human understanding; most of these models have effectively become black boxes. If an attacker can find a behaviour in an AI model that is unknown to its developers, they can exploit that behaviour for potential gain.
Several AI models, including state-of-the-art neural networks, are vulnerable to adversarial examples. That is, these models misclassify inputs that are only slightly different (imperceptibly so to humans) from correctly classified examples.
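To make this concrete, the sketch below shows one well-known way such examples can be crafted: the Fast Gradient Sign Method (FGSM) described in «Explaining and Harnessing Adversarial Examples». It is a minimal illustration, assuming PyTorch, a pretrained classifier model, a correctly classified input tensor x, and its true label y; the function name and the epsilon value are illustrative, not a specific library API.

```python
# Minimal FGSM sketch (assumed names: model, x, y; epsilon is illustrative).
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Craft an adversarial example by nudging every input value a small
    step in the direction that increases the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Each pixel moves by at most +/- epsilon, so the perturbed input stays
    # nearly indistinguishable from the original to a human observer,
    # yet it can flip the model's prediction with high confidence.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

The attack is cheap because it reuses the same gradient information the model relies on for learning: one forward and one backward pass are enough to produce the perturbation.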
Vulnerability to AAI is one of the major risks of applying AI in safety- or security-critical environments. Attacks against core technologies such as computer vision, optical character recognition (OCR), natural language processing (NLP), voice and video (DeepFakes), and malware detection have already been demonstrated.
AAI threat examples include:
- adversarial images that cause computer vision or OCR systems to misread their input
- adversarial text or audio that confuses NLP and voice systems
- DeepFake voice and video generated at a scale that was previously impossible
- malware crafted to evade AI-based malware detection
AAI targets an area of the attack surface we have never previously secured: the AI models themselves. Organizations need to include their AI models, and the automation and decision making driven by them, in their risk assessments. Defending against AAI encompasses proactive and reactive strategies: proactive strategies make AI models more robust against adversarial examples (for example through adversarial training, sketched below), while reactive strategies aim to detect adversarial examples once the model is in use.
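As an illustration of the proactive side, the sketch below shows a single step of adversarial training, reusing the hypothetical fgsm_example() helper from the earlier sketch. The model, optimizer, and batch names are assumptions for the example, not a prescribed defence implementation; reactive defences would instead sit in front of the deployed model and try to flag suspicious inputs at inference time.

```python
# Minimal adversarial-training sketch (assumes the fgsm_example() helper above
# and standard PyTorch objects: model, optimizer, and a batch (x, y)).
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """Train on clean and adversarially perturbed inputs so the model learns
    to classify both correctly, making it more robust to FGSM-style attacks."""
    model.train()
    # Craft adversarial counterparts of the current batch on the fly.
    x_adv = fgsm_example(model, x, y, epsilon)
    optimizer.zero_grad()  # discard gradients accumulated while crafting x_adv
    loss = 0.5 * (F.cross_entropy(model(x), y) +
                  F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()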
We need to appreciate and develop an understanding of this new and evolving threat environment, and increasingly challenge processes driven by automated AI decision making.
Further reading:
- Explaining and Harnessing Adversarial Examples (Goodfellow et al., 2015)
- What is adversarial artificial intelligence and why does it matter?