A new intelligent system for malicious URLs detection
Hayder Raeed Hekmat AL-Shawk,
Ibrahim M. El-Hasnony and
Hazem M. El-Bakry
Edelweiss Applied Science and Technology, 2025, vol. 9, issue 2, 1374-1390
Abstract:
In cybersecurity, detecting and mitigating malicious URLs is a major challenge because they enable a range of cyber threats, including phishing, malware distribution, and fraud. This paper presents a malicious URL detection system built on machine learning and data mining methods. The proposed system comprises five steps: data acquisition, preprocessing, feature selection, URL tokenization, and classification. First, we acquire a recent dataset containing both malicious and benign URLs, described by 87 numerical features. The features are scaled with a standard scaler to prevent the model from being biased toward certain features. Next, the Fick's Law metaheuristic optimization algorithm (FLA) is used for feature selection, with the accuracy of a Light Gradient Boosting Machine (LGBM) as its fitness function, resulting in a 50% feature reduction. The URLs are tokenized using Bidirectional Encoder Representations from Transformers (BERT) and converted to feature vectors. The combined BERT feature vector and FLA-selected features are input to the Categorical Boosting (CatBoost) classifier, which achieves 96.59% accuracy, 96.75% precision, 96.41% recall, and a 96.58% F1-score. In validation, the system surpasses the other machine learning and deep learning methods it was compared against. It also outperforms the results of previous studies that used the same dataset. The proposed system is an effective and efficient approach for detecting malicious URLs, safeguarding digital assets, and ensuring the integrity of online environments.
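The data flow described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names are hypothetical, the binary mask stands in for whatever subset FLA would select, the embedding is a placeholder for a BERT-derived vector, and no LGBM or CatBoost model is trained. It only shows standard scaling, mask-based feature selection, concatenation of the two feature groups, and the four reported metrics.

```python
from statistics import fmean, pstdev


def standard_scale(rows):
    """Z-score each column: (x - mean) / std, the usual standard-scaler rule."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population std; constant columns fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


def apply_mask(row, mask):
    """Keep only features whose mask bit is 1 (assumed output of a selector such as FLA)."""
    return [x for x, keep in zip(row, mask) if keep]


def combine(selected_features, url_embedding):
    """Concatenate selected numeric features with a URL embedding (placeholder for BERT)."""
    return list(selected_features) + list(url_embedding)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1, treating label 1 as the malicious class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a wrapper-style selector, a candidate mask would be scored by training a classifier on the masked features and using its accuracy as the fitness value, which is the role the abstract assigns to LGBM. As a consistency check on the reported figures, the F1 formula applied to 96.75% precision and 96.41% recall gives approximately 96.58%, in line with the stated F1-score.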
Keywords: BERT; CatBoost; Fick’s Law Algorithm; LGBM; Malicious URL Detection.
Date: 2025
Downloads:
https://learning-gate.com/index.php/2576-8484/article/view/4650/1807 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:ajp:edwast:v:9:y:2025:i:2:p:1374-1390:id:4650
More articles in Edelweiss Applied Science and Technology from Learning Gate