Classification Spam Email with Elimination of Unsuitable Features with Hybrid of GA-Naive Bayes
O. M. E. Ebadati and F. Ahmadzadeh
Additional contact information
O. M. E. Ebadati: Department of Mathematics & Computer Science, Kharazmi University, Tehran, Iran
F. Ahmadzadeh: Department of Knowledge Engineering and Decision Science, Kharazmi University, Tehran, Iran
Journal of Information & Knowledge Management (JIKM), 2019, vol. 18, issue 01, 1-19
Abstract:
Email spam is a security problem that has been addressed with a variety of machine learning techniques. The rise of this problem makes organisational email services unreliable and directly increases clients' exposure to unexpected malicious mail, such as ransomware. There are several methods for identifying spam emails. Most of these methods focus on feature selection; however, these models decrease detection accuracy. This paper proposes a novel spam detection method that does not decrease accuracy and also eliminates unsuitable features with less processing. The features are content terms, and their number is very large, so eliminating unsuitable ones can reduce memory complexity. We use the text email samples from Hewlett-Packard (HP) Laboratories. First, a genetic algorithm (GA) is employed to select features, with no limit on the number of features selected, using Bayesian theory as the fitness function, and is checked with different numbers of repetitions. The GA results improved as the number of repetitions increased, and were tested with two selection methods, random selection and tournament selection. In the second stage, the emails in the dataset are classified as spam or ham by Naive Bayes. The results show that Naive Bayes and the hybrid GA-Naive Bayes give almost identical results, but GA-Naive Bayes performs better.
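The two-stage idea in the abstract (a GA searching over feature subsets, scored by a Bayesian classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the abstract does not give the exact fitness formula, so validation accuracy of a Bernoulli Naive Bayes model is used; the tournament size, one-point crossover, bit-flip mutation, elitism, and all parameter values are hypothetical; and the data is synthetic rather than the HP email samples.

```python
import math
import random

def train_nb(X, y, mask):
    """Fit a Bernoulli Naive Bayes model on the features where mask[j] == 1."""
    feats = [j for j, bit in enumerate(mask) if bit]
    model = {"feats": feats, "prior": {}, "p": {}}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        model["prior"][c] = (len(rows) + 1) / (len(X) + 2)
        # Laplace-smoothed estimate of P(feature present | class)
        model["p"][c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in feats]
    return model

def predict_nb(model, x):
    """Return the class (0 = ham, 1 = spam) with the larger log-posterior."""
    scores = {}
    for c in (0, 1):
        s = math.log(model["prior"][c])
        for p, j in zip(model["p"][c], model["feats"]):
            s += math.log(p if x[j] else 1.0 - p)
        scores[c] = s
    return 1 if scores[1] > scores[0] else 0

def fitness(mask, train, valid):
    """Assumed fitness: validation accuracy of Naive Bayes on the masked features."""
    if not any(mask):
        return 0.0
    model = train_nb(train[0], train[1], mask)
    hits = sum(predict_nb(model, x) == label for x, label in zip(*valid))
    return hits / len(valid[1])

def tournament(pop, scores, rng, k=3):
    """Tournament selection: the fittest of k randomly drawn individuals."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: scores[i])]

def ga_select(train, valid, n_feats, rng, pop_size=20, generations=15, p_mut=0.05):
    """Evolve a bit mask over features; the number of kept features is not fixed."""
    pop = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size)]
    best, best_score = None, -1.0
    for _ in range(generations):
        scores = [fitness(m, train, valid) for m in pop]
        for m, s in zip(pop, scores):
            if s > best_score:
                best, best_score = m[:], s
        nxt = [best[:]]  # elitism: carry the best mask found so far forward
        while len(nxt) < pop_size:
            a, b = tournament(pop, scores, rng), tournament(pop, scores, rng)
            cut = rng.randrange(1, n_feats)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    return best, best_score

def make_toy_data(n, n_feats, rng):
    """Synthetic data: features 0-2 correlate with the label, the rest are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [int(rng.random() < (0.8 if label else 0.2)) if j < 3
               else rng.randint(0, 1) for j in range(n_feats)]
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_toy_data(300, 12, rng)
train, valid = (X[:200], y[:200]), (X[200:], y[200:])
baseline = fitness([1] * 12, train, valid)  # plain Naive Bayes, all features
mask, score = ga_select(train, valid, 12, rng)
print("all features:", baseline, "| GA mask:", mask, "| GA accuracy:", score)
```

Replacing `tournament` with a function that draws a parent uniformly at random would give a random-selection variant for comparison, and raising `generations` corresponds to increasing the number of repetitions mentioned in the abstract.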
Keywords: Spam detection; genetic algorithm; Naive Bayes classification; machine learning
Date: 2019
Downloads: http://www.worldscientific.com/doi/abs/10.1142/S0219649219500084
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:18:y:2019:i:01:n:s0219649219500084
DOI: 10.1142/S0219649219500084
Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh
Bibliographic data for series maintained by Tai Tone Lim.