Duplicate detection in pay-per-click streams using temporal stateful Bloom filters

Chamila Walgampaya, Mehmed Kantardzic and Brent Wenerstrom

International Journal of Data Analysis Techniques and Strategies, 2012, vol. 4, issue 4, 340-377

Abstract: Detecting duplicates in click data streams is an important task in fighting click fraud, the act of generating false clicks in internet advertising. Advertising revenue models that charge advertisers for each click leave room for individuals or rival companies to generate false clicks. The extent of click fraud's damage to online advertising has grown tremendously over the years. In this paper, we consider the problem of detecting duplicates in click data streams. Our solution uses a modified version of the counting Bloom filter. The temporal stateful Bloom filter (TSBF) extends the standard counting Bloom filter by replacing the bit-vector with an array of counters of states. These counters are dynamic and decay with time. We conducted a comprehensive set of experiments using synthetic and real-world data. Results are compared with the buffering techniques used in NetMosaics, a click fraud detection and prevention solution. Our results show that the TSBF approach achieves 99% accuracy in duplicate detection while keeping its space requirement constant.
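
Illustrative sketch (not from the paper): the abstract does not give the TSBF algorithm, so the Python below is not the authors' method. It shows a simpler, related idea under stated assumptions: a fixed-size Bloom-filter-style array whose cells hold a last-seen timestamp instead of a bit, so that an entry "decays" once it is older than a chosen window. The class name `TimeWindowBloomFilter`, its parameters, the salted SHA-256 hashing, and the click key format are all hypothetical.

```python
import hashlib


class TimeWindowBloomFilter:
    """Bloom-filter-style duplicate detector with a sliding time window.

    Hypothetical simplification, NOT the paper's TSBF: each cell stores the
    time it was last touched rather than a counter of states.
    """

    def __init__(self, num_cells=1024, num_hashes=4, window=60.0):
        self.num_cells = num_cells
        self.num_hashes = num_hashes
        self.window = window  # seconds a click stays "remembered"
        self.cells = [None] * num_cells  # fixed-size array, constant space

    def _indexes(self, key):
        # Derive num_hashes cell indexes from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_cells

    def seen_recently(self, key, now):
        """Report whether `key` looks like a duplicate within the window,
        then record this occurrence at time `now`."""
        idx = list(self._indexes(key))
        # Duplicate only if every cell was touched within the window.
        duplicate = all(
            self.cells[i] is not None and now - self.cells[i] <= self.window
            for i in idx
        )
        for i in idx:
            self.cells[i] = now  # refresh the timestamps
        return duplicate
```

A caller would build one filter per stream and pass each click's key (for example, a string combining IP address and ad identifier, an assumed format) with its arrival time to `seen_recently`. As with any Bloom filter, unrelated keys can collide, so a positive answer may occasionally be a false positive; memory use stays fixed at `num_cells` entries regardless of stream length.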

Keywords: click fraud; Bloom filters; BFs; streaming data; duplicate detection; pay-per-click advertising models; duplicates; click data streams; false clicks; internet advertising
Date: 2012

Downloads: (external link)
http://www.inderscience.com/link.php?id=50405 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:4:y:2012:i:4:p:340-377

More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd
Handle: RePEc:ids:injdan:v:4:y:2012:i:4:p:340-377