Detecting Emerging DGA Malware in Federated Environments via Variational Autoencoder-Based Clustering and Resource-Aware Client Selection

Duc, Ma Viet; Dang, Pham Minh; Phuong, Tran Thu; Truong, Truong Duc; Hai, Vu; Thanh, Nguyen Huu

Additional contact information
Ma Viet Duc: School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
Pham Minh Dang: School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
Tran Thu Phuong: School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
Truong Duc Truong: School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
Vu Hai: School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
Nguyen Huu Thanh: School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi 100000, Vietnam

Future Internet, 2025, vol. 17, issue 7, 1-25

Abstract: Domain Generation Algorithms (DGAs) remain a persistent technique used by modern malware to establish stealthy command-and-control (C&C) channels, thereby evading traditional blacklist-based defenses. Detecting such evolving threats is especially challenging in decentralized environments where raw traffic data cannot be aggregated due to privacy or policy constraints. To address this, we present FedSAGE, a security-aware federated intrusion detection framework that combines Variational Autoencoder (VAE)-based latent representation learning with unsupervised clustering and resource-efficient client selection. Each client encodes its local domain traffic into a semantic latent space using a shared VAE pre-trained solely on benign domains. These embeddings are clustered via affinity propagation to group clients with similar data distributions and to identify outliers indicative of novel threats, without requiring any labeled DGA samples. Within each cluster, FedSAGE selects only the fastest clients for training, balancing computational constraints with threat visibility. Experimental results on the multi-zone DGA dataset show that FedSAGE improves detection accuracy by up to 11.6% and reduces energy consumption by up to 93.8% compared to standard FedAvg under non-IID conditions. Notably, the latent clustering perfectly recovers ground-truth DGA family zones, enabling effective anomaly detection in a fully unsupervised manner while remaining privacy-preserving. These findings demonstrate that FedSAGE is a practical and lightweight approach for decentralized detection of evasive malware, offering a viable solution for secure and adaptive defense in resource-constrained edge environments.
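
The per-cluster selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_fastest_clients`, the `per_cluster` parameter, and the use of estimated seconds per local training round as the speed measure are all assumptions made for illustration. The cluster labels are taken as given here; in FedSAGE they would come from affinity propagation over the clients' VAE latent embeddings, which is not shown.

```python
from collections import defaultdict


def select_fastest_clients(cluster_of, round_time_s, per_cluster=1):
    """Pick the `per_cluster` fastest clients from every cluster.

    cluster_of:   dict client_id -> cluster label (assumed to come from
                  clustering the clients' latent embeddings; not computed here)
    round_time_s: dict client_id -> estimated seconds per local training
                  round (hypothetical speed measure; lower is faster)
    """
    # Group client ids by their cluster label.
    members = defaultdict(list)
    for client, label in cluster_of.items():
        members[label].append(client)

    selected = {}
    for label, clients in members.items():
        # Sort by time, then by id so that ties resolve deterministically.
        ranked = sorted(clients, key=lambda c: (round_time_s[c], c))
        selected[label] = ranked[:per_cluster]
    return selected


# Hypothetical example: six clients spread over three clusters.
cluster_of = {"c1": 0, "c2": 0, "c3": 0, "c4": 1, "c5": 1, "c6": 2}
round_time_s = {"c1": 4.2, "c2": 1.9, "c3": 3.0,
                "c4": 7.5, "c5": 2.4, "c6": 5.1}

chosen = select_fastest_clients(cluster_of, round_time_s, per_cluster=1)
print(chosen)  # {0: ['c2'], 1: ['c5'], 2: ['c6']}
```

Because every cluster contributes at least one client, a single-client cluster (such as the one containing `c6` above) still takes part in training even when that client is slow, which is one plausible way to keep rare data distributions visible while limiting the total number of participants.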

Keywords: network security; intrusion detection systems; federated learning; DGA detection
JEL-codes: O3
Date: 2025

Downloads:
https://www.mdpi.com/1999-5903/17/7/299/pdf (application/pdf)
https://www.mdpi.com/1999-5903/17/7/299/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:17:y:2025:i:7:p:299-:d:1693841

Future Internet is currently edited by Ms. Grace You
