A distributed framework for zero-day malware detection using federated ensemble models

Ishfaq, Hassan; Shah, Jamal Hussain; Saleem, Rabia; Afzal, Maira

PLOS ONE, 2026, vol. 21, issue 1, 1-27

Abstract: Classification and detection of zero-day attacks remain a significant challenge in cybersecurity. Because malware families are numerous and the available datasets are imbalanced, real-time detection and classification become increasingly complex and inaccurate. There is therefore an urgent need for an intelligent, adaptive defense mechanism that can identify and classify such attacks with improved precision and robustness. This paper proposes a stacked ensemble federated learning model with an accuracy-aware node weighting scheme to address the challenges posed by inter- and intra-class similarities among different types of malware. In the first phase, malware Portable Executable (PE) files are collected from multiple online repositories and validated by three different antivirus programs through VirusTotal to ensure reliability. The validated files are then converted into images and categorized into 28 families to facilitate feature extraction. In the second phase, deep feature representations are extracted with a fine-tuned ResNet-50 model based on transfer learning, which captures both low-level and high-level patterns relevant to malware classification. The features extracted at multiple distributed nodes are then fed into the proposed Ensemble Stacked Federated Model for enhanced generalization and robust classification. The model is tested on both private and publicly available datasets. The experimental results demonstrate that the proposed method outperforms existing baseline approaches in terms of accuracy and computational efficiency. The authors attribute this improvement to training each federated node independently and then stacking the node outputs with a central ensemble model, which enhances the learning rate and reduces overfitting. The code used for the experiments is made available by the authors.
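
The abstract names an accuracy-aware node weighting scheme but gives no formula for it. The following is only a minimal sketch of one plausible reading, not the authors' method: each node's class-probability output is weighted in proportion to that node's validation accuracy, and the weighted outputs are averaged. This accuracy-weighted soft voting stands in for the paper's central stacked ensemble, which would normally be a trained meta-model. All function names, the proportional weighting rule, and the sample numbers are assumptions made for illustration.

```python
from typing import List, Sequence


def accuracy_weights(accuracies: Sequence[float]) -> List[float]:
    """Normalize per-node validation accuracies into weights that sum to 1.

    ASSUMPTION: plain proportional weighting; the abstract does not state
    how the paper's accuracy-aware weights are actually computed.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one node must have positive accuracy")
    return [a / total for a in accuracies]


def combine_node_outputs(node_probs: Sequence[Sequence[float]],
                         weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-node class-probability vectors.

    ASSUMPTION: weighted soft voting replaces the trained stacking model.
    """
    if len(node_probs) != len(weights):
        raise ValueError("one weight per node is required")
    n_classes = len(node_probs[0])
    combined = [0.0] * n_classes
    for probs, w in zip(node_probs, weights):
        if len(probs) != n_classes:
            raise ValueError("all nodes must report the same classes")
        for c in range(n_classes):
            combined[c] += w * probs[c]
    return combined


def predict_class(node_probs: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Pick the index of the class with the highest combined score."""
    combined = combine_node_outputs(node_probs, weights)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical example: three nodes, two malware families.
example_accuracies = [0.9, 0.6, 0.5]
example_probs = [[0.2, 0.8], [0.7, 0.3], [0.6, 0.4]]
example_weights = accuracy_weights(example_accuracies)
example_prediction = predict_class(example_probs, example_weights)
```

In this hypothetical input the most accurate node disagrees with the other two, and the weighting lets its output carry more influence than a plain average would. A stacked design as described in the abstract would instead train a central model on the node outputs.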

Date: 2026

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0339907 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 39907&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0339907

DOI: 10.1371/journal.pone.0339907

More articles in PLOS ONE from Public Library of Science