The Improved Network Intrusion Detection Techniques Using the Feature Engineering Approach with Boosting Classifiers

Rai, Hari Mohan; Yoo, Joon; Agarwal, Saurabh

Hari Mohan Rai, Joon Yoo and Saurabh Agarwal
Additional contact information
Hari Mohan Rai: School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam 13120, Republic of Korea
Joon Yoo: School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam 13120, Republic of Korea
Saurabh Agarwal: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea

Mathematics, 2024, vol. 12, issue 24, 1-35

Abstract: In cybersecurity, threats targeting network devices are a critical concern. With the exponential growth of wireless devices such as smartphones and other portable devices, cyber risks are becoming more frequent and new types of threats continue to emerge. This makes automatic and accurate detection of network-based intrusions essential. In this work, we propose a network-based intrusion detection system that combines a comprehensive feature engineering approach with boosting machine-learning (ML) models. We use a TCP/IP-based dataset with 25,192 data samples from different protocols. To prepare the dataset, we applied preprocessing methods such as label encoding, correlation analysis, custom label encoding, and iterative label encoding. To improve prediction accuracy, we then applied a feature engineering methodology that included novel feature scaling and random forest-based feature selection techniques. For classification, we used three conventional models (NB, LR, and SVC) and four boosting classifiers (CatBoostGBM, LightGBM, HistGradientBoosting, and XGBoost). Each model was trained using 10-fold cross-validation. After an assessment using multiple metrics, XGBoost emerged as the best-performing model, with mean values of 99.54 ± 0.0007 for accuracy, 99.53 ± 0.0013 for precision, 99.54 ± 0.001 for recall, and 99.53 ± 0.0014 for F1-score. Additionally, we used ROC curves to evaluate the models; all boosting classifiers obtained a perfect AUC value of one. The proposed methodology is effective and accurate in detecting network intrusions, setting the stage for real-time use of the model. Our method provides a strong defensive measure against malicious intrusions into network infrastructures as cyber threats continue to evolve.
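
The evaluation protocol described in the abstract (10-fold cross-validation, with each metric reported as mean ± standard deviation across folds) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (`k_fold_indices`, `binary_metrics`, `cross_validate`) are hypothetical, the toy one-feature threshold classifier stands in for the boosting models, and the data are synthetic rather than the 25,192-sample dataset.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = intrusion)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return
    {metric: (mean, sample standard deviation)} over the k folds."""
    folds = k_fold_indices(len(y), k, seed)
    per_fold = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        y_pred = [predict(model, X[j]) for j in test_idx]
        per_fold.append(binary_metrics([y[j] for j in test_idx], y_pred))
    return {name: (statistics.mean([m[name] for m in per_fold]),
                   statistics.stdev([m[name] for m in per_fold]))
            for name in per_fold[0]}


# Toy stand-in classifier (assumption): threshold halfway between class means.
def fit_threshold(X, y):
    mean0 = statistics.mean([x[0] for x, label in zip(X, y) if label == 0])
    mean1 = statistics.mean([x[0] for x, label in zip(X, y) if label == 1])
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return 1 if x[0] > threshold else 0


# Synthetic, well-separated data (assumption), not the paper's dataset.
X = [[float(v)] for v in list(range(50)) + list(range(100, 150))]
y = [0] * 50 + [1] * 50

summary = cross_validate(X, y, fit_threshold, predict_threshold, k=10)
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.4f} ± {std:.4f}")
```

In practice, the `fit` and `predict` callables would wrap a real classifier, and a stratified split would likely be preferable so that every fold contains both classes.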

Keywords: network intrusion detection; machine learning; XGBoost; feature engineering; cyber security; threat detection
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/24/3909/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/24/3909/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:24:p:3909-:d:1541719

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.