Interpretable Machine Learning Models for Malicious Domains Detection Using Explainable Artificial Intelligence (XAI)
Nida Aslam,
Irfan Ullah Khan,
Samiha Mirza,
Alanoud AlOwayed,
Fatima M. Anis,
Reef M. Aljuaid and
Reham Baageel
Additional contact information
Nida Aslam: SAUDI ARAMCO Cybersecurity Chair, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia
Irfan Ullah Khan: Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
Samiha Mirza: Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
Alanoud AlOwayed: Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
Fatima M. Anis: Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
Reef M. Aljuaid: Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
Reham Baageel: Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
Sustainability, 2022, vol. 14, issue 12, 1-22
Abstract:
With the expansion of the internet, a major threat has emerged: malicious domains that attackers use to carry out illegal activities targeting governments, violating the privacy of organizations, and even manipulating everyday users. Detecting these harmful domains is therefore necessary to combat growing network attacks. Machine Learning (ML) models have shown strong results in detecting malicious domains. However, the “black box” nature of complex ML models hinders their wide acceptance in some fields. Explainable Artificial Intelligence (XAI) has made it possible to incorporate interpretability and explainability into these complex models, and post hoc XAI methods provide this interpretability without affecting model performance. This study proposes an XAI model to detect malicious domains, evaluated on a recent dataset containing 45,000 samples of malicious and non-malicious domains. Several interpretable ML models, such as Decision Tree (DT) and Naïve Bayes (NB), and black box ensemble models, such as Random Forest (RF), Extreme Gradient Boosting (XGB), AdaBoost (AB), and CatBoost (CB), were implemented, and XGB outperformed the other classifiers. The post hoc XAI global surrogate method Shapley Additive Explanations (SHAP) and the local surrogate method LIME were then used to explain the XGB predictions. Two sets of experiments were performed: first with the full preprocessed dataset, and then with a subset of features chosen by the Sequential Forward Feature Selection algorithm. The results demonstrate that the ML algorithms were able to distinguish benign and malicious domains with overall accuracy ranging from 0.8479 to 0.9856. The ensemble classifier XGB achieved the best results, with an AUC of 0.9991 and accuracy of 0.9856 before feature selection, and an AUC of 0.999 and accuracy of 0.9818 after feature selection. The proposed model outperformed the benchmark study.
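As a rough illustration of the workflow described in the abstract, the sketch below trains an XGBoost classifier, selects features with forward sequential selection, and explains predictions globally with SHAP and locally with LIME. It is a minimal sketch assuming a scikit-learn-style pipeline; the file name malicious_domains.csv, the label column, the class names, and all parameter choices (e.g., n_features_to_select=10) are hypothetical and are not taken from the paper.

# Minimal sketch of the described workflow (assumed, not the authors' code).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import accuracy_score, roc_auc_score
from xgboost import XGBClassifier
import shap
from lime.lime_tabular import LimeTabularExplainer

# Hypothetical dataset file and label column.
data = pd.read_csv("malicious_domains.csv")
X, y = data.drop(columns=["label"]), data["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# 1) Train XGBoost on the full preprocessed feature set and evaluate.
model = XGBClassifier(n_estimators=300, eval_metric="logloss")
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("AUC:", roc_auc_score(y_test, proba))

# 2) Sequential Forward Feature Selection, then retrain on the reduced set.
sfs = SequentialFeatureSelector(XGBClassifier(eval_metric="logloss"),
                                n_features_to_select=10, direction="forward")
sfs.fit(X_train, y_train)
selected = X_train.columns[sfs.get_support()]
model_sfs = XGBClassifier(n_estimators=300, eval_metric="logloss")
model_sfs.fit(X_train[selected], y_train)

# 3) Global explanation of the XGB model with SHAP (TreeExplainer).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)

# 4) Local explanation of a single prediction with LIME.
lime_explainer = LimeTabularExplainer(X_train.values,
                                      feature_names=list(X_train.columns),
                                      class_names=["benign", "malicious"],
                                      mode="classification")
lime_exp = lime_explainer.explain_instance(X_test.values[0],
                                           model.predict_proba,
                                           num_features=10)
print(lime_exp.as_list())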
Keywords: network security; malicious domains; machine learning; ensemble models; explainable artificial intelligence
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2022
Citations: (2)
Downloads: (external link)
https://www.mdpi.com/2071-1050/14/12/7375/pdf (application/pdf)
https://www.mdpi.com/2071-1050/14/12/7375/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:14:y:2022:i:12:p:7375-:d:840446
Sustainability is currently edited by Ms. Alexandra Wu