The Impact of the Mode of Data Representation for the Result Quality of the Detection and Filtering of Spam

Reda Mohamed Hamou and Abdelmalek Amine
Additional contact information
Reda Mohamed Hamou: Computer Science Department, Dr Moulay Tahar University of Saïda, Saïda, Algeria
Abdelmalek Amine: Computer Science Department, Dr Moulay Tahar University of Saïda, Saïda, Algeria

International Journal of Information Retrieval Research (IJIRR), 2013, vol. 3, issue 1, 43-59

Abstract: Spam has spread across the Internet in phenomenal proportions and now represents a high percentage of all emails exchanged. To fight spam, the authors develop a hybrid algorithm that first uses a probabilistic model, in this case Naïve Bayes, to weight the terms of the term-category matrix, and then uses an unsupervised learning algorithm (K-means) to filter messages into two classes, spam and ham. To determine the sensitive parameters that improve classification, the authors study message content using language-independent representations based on word n-grams and character n-grams (because a message may be received in any language), in order to decide which representation to choose for good classification. The authors use several evaluation metrics to validate their results.
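
The two message representations named in the abstract, word n-grams and character n-grams, can be illustrated with a minimal Python sketch. The function names, the lowercasing step, the whitespace tokenization, and the sample message are assumptions made for illustration only; the abstract does not specify the authors' preprocessing, and this sketch does not cover the Naïve Bayes weighting or the K-means step.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams (hypothetical helper).

    Spaces are kept, so no language-specific tokenizer is needed.
    """
    text = text.lower()  # assumption: case folding, not stated in the abstract
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams (hypothetical helper).

    Words are obtained by splitting on whitespace, which is an assumption.
    """
    words = text.lower().split()
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


# Made-up sample message, not taken from the paper's corpus.
message = "Win cash now"
chars = char_ngrams(message, 3)  # character 3-grams such as "win", "in ", "n c"
words = word_ngrams(message, 2)  # word 2-grams: "win cash", "cash now"
```

Character n-grams need no word segmentation, which is one plausible reason they suit messages received in any language; word n-grams keep whole terms but depend on how words are delimited.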

Date: 2013

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/ijirr.2013010103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jirr00:v:3:y:2013:i:1:p:43-59

International Journal of Information Retrieval Research (IJIRR) is currently edited by Zhongyu Lu

More articles in International Journal of Information Retrieval Research (IJIRR) from IGI Global

 
Page updated 2025-03-19
Handle: RePEc:igg:jirr00:v:3:y:2013:i:1:p:43-59