Advances in artificial intelligence (AI), and more concretely in machine learning (ML), have enabled its adoption in many domains and applications. Among these domains are several critical fields, such as healthcare. Malicious users, called adversaries, could modify the behaviour of these AI systems, obtaining undesired results and even having an impact on society. Thus, the cybersecurity community has turned its attention to the security threats affecting AI. The SPARTA SAFAIR program seeks to develop defences against such attacks to protect AI systems.
To defend a system, it is necessary to know what kinds of attack it is vulnerable to. Therefore, the first step is to study the different attacks against AI systems. Among others, there is a widely known attack, the adversarial attack, which is one of the most problematic attacks an ML model can suffer [1].
Let us imagine that an ML model helps a hospital diagnose breast cancer. This model could help the doctor detect from images whether a patient suffers from breast tissue cancer. However, suppose that an adversary can modify the images the model diagnoses. These modifications could be minimal, imperceptible to the human eye, yet sufficient to confuse the ML model and lead to a misdiagnosis. Could this situation be real? [2] This specific attack is the aforementioned adversarial attack, and it has posed, and still poses, many problems for researchers.
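To give a feel for how small such a modification can be, the sketch below applies the fast gradient sign method (FGSM), one well-known way of crafting adversarial examples. It is only an illustration, not the specific attack studied in SAFAIR; the model, image and label passed to it are placeholders.

```python
# Minimal FGSM sketch (PyTorch). `model`, `image` and `label` are placeholders;
# `epsilon` bounds the perturbation so the change stays barely visible.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Return a copy of `image` with a small adversarial perturbation."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Take one step in the direction that most increases the loss,
    # then clip the result back to the valid pixel range [0, 1].
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```

A perturbation of this size typically leaves the image looking unchanged to a person while flipping the model's prediction.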
Once an attack is detected and studied, countermeasures must be conceived and developed. In particular, several countermeasures against adversarial attacks have been proposed within the SAFAIR program, including adversarial training [3], dimensionality reduction [4], prediction similarity [2] and a detector based on activations.
These defences work, to a greater or lesser extent, to prevent adversarial attacks. Each of them focuses on a different aspect of the ML model or of the attack process in order to combat or detect adversarial examples. For example, prediction similarity focuses on detecting the process carried out to obtain adversarial images, while the activation detector inspects the activations of the model being defended to spot the strange behaviour generated by adversarial examples.
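Purely as an illustrative sketch of the general idea behind an activation-based detector (not the SAFAIR tool itself), one could record the hidden-layer activations produced by clean data and flag inputs whose activations drift too far from that baseline. The layer choice, distance measure and threshold below are assumptions made for the example.

```python
# Illustrative activation-based detector sketch (PyTorch), not the SAFAIR detector.
# It flags inputs whose hidden-layer activations lie far from those of clean data.
import torch
import torch.nn as nn

class ActivationDetector:
    def __init__(self, model: nn.Module, layer: nn.Module):
        self.model = model
        self._acts = None
        # A forward hook captures the chosen layer's output on every pass.
        layer.register_forward_hook(self._hook)
        self.mean = None
        self.threshold = None

    def _hook(self, module, inputs, output):
        self._acts = output.detach().flatten(start_dim=1)

    def fit(self, clean_inputs: torch.Tensor, quantile: float = 0.99):
        """Learn what 'normal' activations look like from clean samples."""
        with torch.no_grad():
            self.model(clean_inputs)
        self.mean = self._acts.mean(dim=0)
        distances = (self._acts - self.mean).norm(dim=1)
        self.threshold = torch.quantile(distances, quantile)

    def is_adversarial(self, inputs: torch.Tensor) -> torch.Tensor:
        """Return a boolean mask: True where activations look anomalous."""
        with torch.no_grad():
            self.model(inputs)
        distances = (self._acts - self.mean).norm(dim=1)
        return distances > self.threshold
```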
The SPARTA SAFAIR program has organised a contest to test these defences together with other attacks and countermeasures. The contest is set up as a two-player game in which models and attacks are continuously pitted against each other. This encourages a co-evolution in which attacks progressively adapt to the defence strategies of the models, and in which the models progressively learn to defend better against strong attacks. The contest, which will start on the first of March, will be advertised soon. Be ready!
Cybersecurity is increasingly important in many fields, and ML is no exception: the challenge of cybersecurity is to protect people from cyberattacks, and ML models are becoming part of our lives.
For more details on the defences of AI systems and the work done within SAFAIR, you can consult deliverable 7.2 of SPARTA: Preliminary description of AI systems security mechanisms and tools.
[1] N. Akhtar and A. Mian, “Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey,” IEEE Access, vol. 6, pp. 14410–14430, 2018.
[2] X. Echeberria-Barrio, A. Gil-Lerchundi, I. Goicoechea-Telleria, and R. Orduna-Urrutia, “Deep learning defenses against adversarial examples for dynamic risk assessment,” in Proc. CISIS, 2020.
[3] X. Yuan, P. He, Q. Zhu, and X. Li, “Adversarial Examples: Attacks and Defenses for Deep Learning,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 9, pp. 2805–2824, 2019.
[4] R. Sahay, R. Mahfuz, and A. El Gamal, “Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder Approach,” in Proc. 53rd Annu. Conf. Inf. Sci. Syst. (CISS), 2019.