N-Trans: Parallel Detection Algorithm for DGA Domain Names

Yang, Cheng; Lu, Tianliang; Yan, Shangyi; Zhang, Jianling; Yu, Xingzhan

Additional contact information
Cheng Yang: College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
Tianliang Lu: College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
Shangyi Yan: College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
Jianling Zhang: College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
Xingzhan Yu: College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China

Future Internet, 2022, vol. 14, issue 7, 1-15

Abstract: Domain generation algorithms (DGAs) are widely used in malware, such as botnet binaries, to generate large sequences of domain names, some of which are registered by cybercriminals. Accurate detection of these malicious domains can effectively defend against cyber attacks. Many researchers have explored detecting such domain names with traditional machine learning algorithms, but the results still leave room for improvement. To improve on them, we propose a novel parallel detection model named N-Trans, which combines the N-gram algorithm with the Transformer model. First, we add flag bits to the first and last positions of the domain name; the N-gram algorithm and the Transformer framework are then combined in parallel to classify it. The model effectively extracts letter-combination features and captures positional features, such as the first and last letters of the domain name and the positional relationships between letters, so it can accurately distinguish legitimate from malicious domain names. In the experiments, the dataset consists of legitimate domain names from Alexa and malicious domain names collected by the 360 Security Lab. The experimental results show that N-Trans achieves 96.97% accuracy for DGA malicious domain name detection and outperforms mainstream malicious domain name detection algorithms.
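
The preprocessing step described in the abstract (flag bits at the first and last positions, followed by N-gram extraction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the flag characters "^" and "$", the choice of n = 2, and the function names add_flags, char_ngrams and ngram_counts are all assumptions made for illustration, since the abstract does not specify them. The Transformer branch is not shown.

```python
from collections import Counter
from typing import List

START_FLAG = "^"  # assumed start marker; the abstract does not name the flag symbols
END_FLAG = "$"    # assumed end marker


def add_flags(domain: str) -> str:
    """Lowercase a domain label and wrap it in start and end flag characters."""
    return f"{START_FLAG}{domain.lower()}{END_FLAG}"


def char_ngrams(text: str, n: int = 2) -> List[str]:
    """Return all overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(domain: str, n: int = 2) -> Counter:
    """Count the n-grams of the flagged domain (a simple bag-of-n-grams feature)."""
    return Counter(char_ngrams(add_flags(domain), n))


if __name__ == "__main__":
    flagged = add_flags("google")
    print(flagged)                  # the domain with both flags attached
    print(char_ngrams(flagged, 2))  # bigrams, including the two that touch a flag
```

Because the flags take part in the N-grams, the first and last letters of the domain each appear in an N-gram of their own (here "^g" and "e$"), which is one plausible way the first-letter and last-letter features mentioned in the abstract could be exposed to a classifier.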

Keywords: malicious domain name; DGA; parallel detection model; N-gram; Transformer model; N-Trans
JEL-codes: O3
Date: 2022

Downloads:
https://www.mdpi.com/1999-5903/14/7/209/pdf (application/pdf)
https://www.mdpi.com/1999-5903/14/7/209/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:14:y:2022:i:7:p:209-:d:861976

Future Internet is currently edited by Ms. Grace You

Bibliographic data for series maintained by MDPI Indexing Manager.