Threat analysis and model developed in SPARTA SAFAIR Program
The current adoption of AI in computer-based systems, and the indications of its future ubiquitous presence, increase the need to identify threats against AI systems, such as attacks on the availability of those systems or on data integrity. The SPARTA SAFAIR program aims to ensure the trustworthiness of artificial intelligence (AI) systems, including their security, privacy and reliability. To better understand these threats, the first task of the program is to design and develop threat analysis tools that support risk assessment and management in AI.
This first requires identifying the potential security and privacy incidents that may affect AI systems and the attacks against AI components that need to be defended. The initial task of the SPARTA SAFAIR program was therefore to collect the attacks available in the state of the art and to build a taxonomy that identifies each of them uniquely, while remaining general enough to accommodate future attacks.
The analysis of potential threats against AI systems starts from the system's constituent parts and the assets to be protected, identifying their vulnerabilities and the ways in which these components could fail unintentionally or be attacked by malicious adversaries. The main elements of the taxonomy are described next.
The target AI assets combine those used in the software engineering of AI systems with concepts found in the machine learning attack literature. They range from the raw data and its characteristics, through the data used for model generation (training and test data) and the trained model itself, to the assets that appear when the model is operating (operational data, benchmark data, the model testing tool, the runtime model monitoring tool and the inference results).
Against these assets, four groups of AI attack tactics have been identified:
- Data access: with this kind of access the attacker can build a surrogate model that imitates the behaviour of the operating model and use it to craft other types of attacks and to test their efficacy before launching them [1] (illustrated in the first sketch after this list).
- Poisoning: these attacks are used to alter, directly or indirectly, the operational data or the model of the AI system [2].
- Evasion: with these methods the adversary exploits model vulnerabilities so that, using handcrafted input data, they can cause operational problems or obtain desired outputs from the model [3] (illustrated in the second sketch after this list).
- Oracle: the attacker extracts information about the model, such as its structure or architecture, the parameters used for its generation, or even the data used in the training phase [4].
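As a first illustration, the following minimal sketch outlines how an attacker with data access might build a surrogate model by querying the operating model and training a local imitation on its responses. The black-box `query_victim` function, the probe distribution and the classifier choice are hypothetical assumptions for illustration only, not part of the SAFAIR taxonomy.

```python
# Minimal surrogate-model sketch (illustrative only).
# Assumption: `query_victim(X)` is a hypothetical black-box prediction API
# of the operating model; the attacker can only observe its outputs.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_surrogate(query_victim, n_queries=1000, n_features=20, seed=0):
    """Imitate a black-box model by training on its query responses."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_queries, n_features))  # probe inputs
    y = query_victim(X)               # labels returned by the victim model
    surrogate = DecisionTreeClassifier().fit(X, y)
    return surrogate                  # local copy used to prepare further attacks
```

The surrogate can then be used offline to test candidate attacks before they are submitted to the real system, as described in [1].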
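The second sketch illustrates the evasion tactic with a fast gradient sign method (FGSM) style perturbation. The PyTorch classifier `model`, the input batch `x`, the labels `y` and the budget `epsilon` are placeholder assumptions; the sketch only conveys the general idea of handcrafting adversarial inputs.

```python
# Minimal FGSM-style evasion sketch (illustrative only).
# Assumptions: a trained PyTorch classifier `model`, an input batch `x`
# with true labels `y`, and a perturbation budget `epsilon`.
import torch
import torch.nn.functional as F

def fgsm_evasion(model, x, y, epsilon=0.03):
    """Craft an adversarial input by taking one signed-gradient step."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # loss w.r.t. the true labels
    loss.backward()                           # gradient of the loss w.r.t. the input
    # Move each input component in the direction that increases the loss the most.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()     # keep the input in a valid range
```

Comparing the model's predictions on `x` and on the returned perturbation gives a rough indication of how sensitive the operating model is to such handcrafted inputs.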
On the basis of this risk assessment model, the SPARTA SAFAIR program has developed multiple defences against some of these attacks, and they are currently being validated. These defences help to detect or avoid attacks, reduce the vulnerabilities of the operating model or, at least, make those vulnerabilities known.

References:
[1] N. Papernot, P. McDaniel, A. Sinha, and M. P. Wellman, "SoK: Security and Privacy in Machine Learning," 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 399–414, 2018.
[2] B. Biggio, B. Nelson, and P. Laskov, "Poisoning Attacks against Support Vector Machines," Proceedings of the 29th International Conference on Machine Learning (ICML), 2012.
[3] N. Papernot, P. McDaniel, and I. Goodfellow, "Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples," arXiv preprint, 2016.
[4] F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, "Stealing Machine Learning Models via Prediction APIs," Proceedings of the 25th USENIX Security Symposium, 2016.