Adversarial AI attack detection: a novel approach using explainable AI and deception mechanisms

Niculae, Maria; Suciu, George; Stanescu, Vlad; Sachian, Mari-Anais; Farao, Aristeidis; Sabazioti, Athanasia; Xenakis, Christos; Xenakis, Dionysios; Lacalle, Ignacio; Grammatikis, Panagiotis Radoglou; Brozos, Nikolaos Sachpelidis; Lekka, Zacharenia; Bernardinetti, Giorgio; Tsiota, Anastasia; Kalpaktsoglou, Georgios; Karagiannis, Stylianos

Maria Niculae, George Suciu, Vlad Stanescu, Mari-Anais Sachian, Aristeidis Farao, Athanasia Sabazioti, Christos Xenakis, Dionysios Xenakis, Ignacio Lacalle, Panagiotis Radoglou Grammatikis, Nikolaos Sachpelidis Brozos, Zacharenia Lekka, Giorgio Bernardinetti, Anastasia Tsiota, Georgios Kalpaktsoglou and Stylianos Karagiannis
Additional contact information
Maria Niculae: Beia Consult International, Bucharest, Romania
George Suciu: Beia Consult International, Bucharest, Romania
Vlad Stanescu: Beia Consult International, Bucharest, Romania
Mari-Anais Sachian: Beia Consult International, Bucharest, Romania
Aristeidis Farao: University of Piraeus, Piraeus, Greece
Athanasia Sabazioti: University of Piraeus, Piraeus, Greece
Christos Xenakis: University of Piraeus, Piraeus, Greece
Dionysios Xenakis: Department of Digital Industry Technologies of the National and Kapodistrian University of Athens, Athens, Greece
Ignacio Lacalle: Universitat Politècnica de València, Valencia, Spain
Panagiotis Radoglou Grammatikis: K3Y, Sofia, Bulgaria
Nikolaos Sachpelidis Brozos: K3Y, Sofia, Bulgaria
Zacharenia Lekka: K3Y, Sofia, Bulgaria
Giorgio Bernardinetti: Consorzio Nazionale Interuniversitario per le Telecomunicazioni, Parma, Italy
Anastasia Tsiota: Fogus Innovations and Services, Athens, Greece
Georgios Kalpaktsoglou: Fogus Innovations and Services, Athens, Greece
Stylianos Karagiannis: PDM, Lisbon, Portugal

Smart Cities International Conference (SCIC) Proceedings, 2024, vol. 12, 623-647

Abstract: Detecting adversarial AI attacks has become a critical issue as AI systems become integral to industries ranging from healthcare and finance to transportation. Adversarial attacks exploit weaknesses in machine learning and deep learning models, and they can cause serious disruptions and threaten the integrity of AI operations. This paper therefore focuses on developing robust mechanisms for detecting adversarial inputs in real time, so that AI systems remain resilient against such sophisticated threats. Existing defenses against adversarial AI, such as input sanitization, anomaly detection, and adversarial training, provide important foundational work, but most of them struggle to generalize across attack types or to perform in real time. This work extends detection capabilities with explainable AI (XAI) and deception mechanisms. Adversarial activity is detected through adversarial training combined with honeypots and digital twins, while XAI keeps the detection process transparent. Honeypots and digital twins act as decoys for attackers, and observing attacker behavior in them can further strengthen detection methods. The results so far promise substantial improvements in the detection of adversarial attacks in high-risk AI applications, in the efficacy of honeypots for capturing malicious behavior, and in the use of XAI to make the detection process more interpretable and reliable. These techniques are intended to enhance the robustness of AI systems against adversarial threats. The research contributes practical tools for cybersecurity professionals and AI practitioners defending against these attacks, and it offers new insights into AI for cybersecurity.
The novelty of the paper lies in integrating adversarial training, XAI, and deception techniques into a combined, interpretable, and effective method for detecting adversarial AI attacks across industry sectors.
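
The abstract names adversarial training and XAI without describing how they are implemented. As a rough illustration only, and not the authors' method, the sketch below shows one common form of each on a toy logistic-regression model: adversarial examples generated with the Fast Gradient Sign Method (FGSM), adversarial training that mixes those examples into each training step, and a simple per-feature attribution (input times weight) as a stand-in for an XAI explanation. The synthetic data, the function names (`fgsm`, `train`, `accuracy`, `attribution`), and the value `eps = 0.3` are all assumptions made for this example. Honeypots and digital twins are not modeled here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm(x, y, w, b, eps):
    """FGSM for logistic regression: step each input by eps in the
    direction of the sign of the loss gradient w.r.t. that input."""
    # d(binary cross-entropy)/dx = (sigmoid(w.x + b) - y) * w
    grad_x = (sigmoid(x @ w + b) - y)[:, None] * w[None, :]
    return x + eps * np.sign(grad_x)


def train(x, y, eps=0.0, epochs=200, lr=0.5):
    """Plain gradient descent; if eps > 0, every epoch also trains on
    FGSM examples crafted against the current weights."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        if eps > 0:
            xb = np.vstack([x, fgsm(x, y, w, b, eps)])
            yb = np.concatenate([y, y])
        else:
            xb, yb = x, y
        err = sigmoid(xb @ w + b) - yb
        w -= lr * (xb.T @ err) / len(yb)
        b -= lr * err.mean()
    return w, b


def accuracy(x, y, w, b):
    return float(((sigmoid(x @ w + b) > 0.5) == (y > 0.5)).mean())


def attribution(x, w):
    """Per-feature contribution to the logit (input times weight).
    For a linear model these contributions plus the bias equal the logit."""
    return x * w


# Hypothetical toy data: two Gaussian clusters in two dimensions.
rng = np.random.default_rng(0)
n = 200
x = np.vstack([rng.normal(-1.0, 1.0, (n, 2)), rng.normal(1.0, 1.0, (n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

eps = 0.3
w_plain, b_plain = train(x, y)             # standard training
w_robust, b_robust = train(x, y, eps=eps)  # adversarial training

x_adv = fgsm(x, y, w_plain, b_plain, eps)  # attack the standard model
clean_acc = accuracy(x, y, w_plain, b_plain)
adv_acc = accuracy(x_adv, y, w_plain, b_plain)
explanations = attribution(x_adv, w_robust)

print("standard model, clean inputs:", clean_acc)
print("standard model, FGSM inputs: ", adv_acc)
print("attribution for first adversarial input:", explanations[0])
```

For a linear model, FGSM always moves the logit away from the true label, so accuracy on the perturbed inputs cannot exceed accuracy on the clean ones. Whether the adversarially trained weights recover some of that loss depends on the data and on `eps`, so the sketch makes no claim about the size of any improvement.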

Keywords: Adversarial AI detection; adversarial training; deception mechanisms; explainable AI Decision-Making; Workforce Development
JEL-codes: O35
Date: 2024

Downloads:
https://scrd.eu/index.php/scic/article/view/719/728 (application/pdf)
https://scrd.eu/index.php/scic/article/view/719 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:pop:procee:v:12:y:2024:623-647

Series: Smart Cities International Conference (SCIC) Proceedings, from Smart-EDU Hub, Faculty of Public Administration, National University of Political Studies & Public Administration.
Bibliographic data for series maintained by Professor Catalin Vrabie.