Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer
Thomas Largillier and Sylvain Peyronnet
Additional contact information
Thomas Largillier: LRI, Université Paris-Sud, F-91405, France
Sylvain Peyronnet: LRI, Université Paris-Sud, F-91405, France
International Journal of Organizational and Collective Intelligence (IJOCI), 2011, vol. 2, issue 2, 36-48
Search engines use several criteria to rank webpages and choose which pages to display when answering a request. Those criteria can be separated into two notions, relevance and popularity. The notion of popularity is calculated by the search engine and is related to links made to the webpage. Malicious webmasters want to artificially increase their popularity; the techniques they use are often referred to as Webspam. It can take many forms and is in constant evolution, but Webspam usually consists of building a specific dedicated structure of spam pages around a given target page. It is important for a search engine to address the issue of Webspam; otherwise, it cannot provide users with fair and reliable results. In this paper, the authors propose a technique to identify Webspam through the frequency language associated with random walks among those dedicated structures. The authors identify the language by calculating the frequency of appearance of k-grams on random walks launched from every node.
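As a rough illustration of the idea in the abstract, the following minimal sketch launches random walks from every node of a small link graph and computes the relative frequency of k-grams along those walks. It is not the authors' implementation. The function names (`random_walk`, `kgram_frequencies`), the parameters, and in particular the choice of emitting each visited node's out-degree as its symbol are assumptions made only for this example; the abstract does not specify the alphabet over which k-grams are counted.

```python
import random
from collections import Counter


def random_walk(graph, start, length, rng):
    """Return the nodes visited by a walk of at most `length` steps from `start`."""
    path = [start]
    node = start
    for _ in range(length):
        successors = graph.get(node, [])
        if not successors:  # dangling node: stop the walk early
            break
        node = rng.choice(successors)
        path.append(node)
    return path


def kgram_frequencies(graph, k=2, walk_length=10, walks_per_node=5,
                      symbol=None, seed=0):
    """Relative frequency of each k-gram seen on walks launched from every node.

    `graph` maps a node to the list of nodes it links to.
    `symbol` maps a node to the letter it emits; by default the out-degree
    (an assumption for this sketch, not taken from the paper).
    """
    rng = random.Random(seed)
    if symbol is None:
        symbol = lambda node: len(graph.get(node, []))
    counts = Counter()
    for start in graph:
        for _ in range(walks_per_node):
            word = [symbol(n) for n in random_walk(graph, start, walk_length, rng)]
            # Slide a window of size k over the emitted word.
            for i in range(len(word) - k + 1):
                counts[tuple(word[i:i + k])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical toy graph: three pages that all link to a target "t".
    toy = {"s1": ["t"], "s2": ["t"], "s3": ["t"], "t": ["s1", "s2", "s3"]}
    for gram, freq in sorted(kgram_frequencies(toy, k=2).items()):
        print(gram, round(freq, 3))
```

Under these assumptions, a structure dedicated to boosting one target page would be expected to produce a k-gram distribution that differs from that of organically linked pages; how that difference is turned into a detection rule is left to the paper itself.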
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/joci.2011040103 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:joci00:v:2:y:2011:i:2:p:36-48
More articles in International Journal of Organizational and Collective Intelligence (IJOCI) from IGI Global