Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer

Thomas Largillier and Sylvain Peyronnet
Additional contact information
Thomas Largillier: LRI, Université Paris-Sud, F-91405, France
Sylvain Peyronnet: LRI, Université Paris-Sud, F-91405, France

International Journal of Organizational and Collective Intelligence (IJOCI), 2011, vol. 2, issue 2, 36-48

Abstract: Search engines use several criteria to rank webpages and choose which pages to display when answering a request. Those criteria can be separated into two notions, relevance and popularity. The notion of popularity is calculated by the search engine and is related to links made to the webpage. Malicious webmasters want to artificially increase their popularity; the techniques they use are often referred to as Webspam. It can take many forms and is in constant evolution, but Webspam usually consists of building a specific dedicated structure of spam pages around a given target page. It is important for a search engine to address the issue of Webspam; otherwise, it cannot provide users with fair and reliable results. In this paper, the authors propose a technique to identify Webspam through the frequency language associated with random walks among those dedicated structures. The authors identify the language by calculating the frequency of appearance of k-grams on random walks launched from every node.
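The k-gram counting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a k-gram is a sequence of k consecutively visited node identifiers, that each step of the walk picks an outgoing link uniformly at random, and that the graph is a plain adjacency dictionary. The function name `kgram_frequencies` and the parameters `walk_length` and `walks_per_node` are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def kgram_frequencies(graph, k=3, walk_length=20, walks_per_node=10, seed=0):
    """Relative frequency of node k-grams seen on random walks from every node.

    Assumption (not from the paper): `graph` maps each node to a list of
    successor nodes, and a k-gram is k consecutively visited nodes.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter()
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            node = start
            for _ in range(walk_length - 1):
                successors = graph.get(node, [])
                if not successors:  # dead end: stop this walk early
                    break
                node = rng.choice(successors)  # uniform choice of outgoing link
                walk.append(node)
            # Slide a window of size k over the visited nodes.
            for i in range(len(walk) - k + 1):
                counts[tuple(walk[i:i + k])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


if __name__ == "__main__":
    # Toy graph: nodes "s1", "s2", "t" form a tight cycle (a stand-in for a
    # dedicated structure around a target "t"); "a" and "b" link more freely.
    toy = {
        "s1": ["s2"],
        "s2": ["t"],
        "t": ["s1"],
        "a": ["b", "t"],
        "b": ["a", "s1"],
    }
    freqs = kgram_frequencies(toy, k=3)
    for gram, f in sorted(freqs.items(), key=lambda kv: -kv[1])[:5]:
        print(gram, round(f, 3))
```

In a sketch like this, walks that enter a tightly looped structure keep producing the same few k-grams, so those k-grams dominate the frequency table. How such frequencies are turned into a spam decision is described in the paper itself, not here.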

Date: 2011

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/joci.2011040103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:joci00:v:2:y:2011:i:2:p:36-48

More articles in International Journal of Organizational and Collective Intelligence (IJOCI) from IGI Global

 
Page updated 2019-11-24
Handle: RePEc:igg:joci00:v:2:y:2011:i:2:p:36-48