
Mining search intents for collaborative cyberporn filtering

Lung‐Hao Lee and Hsin‐Hsi Chen

Journal of the American Society for Information Science and Technology, 2012, vol. 63, issue 2, 366-376

Abstract: This article presents a search‐intent‐based method to generate pornographic blacklists for collaborative cyberporn filtering. A novel porn‐detection framework that can find newly appearing pornographic web pages by mining search query logs is proposed. First, suspected queries are identified along with their clicked URLs by an automatically constructed lexicon. Then, a URL is selected as a candidate if its number of clicks satisfies majority voting rules. Finally, a candidate whose URL contains at least one categorical keyword is added to the blacklist. Several experiments are conducted on an MSN search porn dataset to demonstrate the effectiveness of our method. The resulting blacklist generated by our search‐intent‐based method achieves high precision (0.701) while maintaining a favorably low false‐positive rate (0.086). Experiments in a real‐life filtering simulation reveal that our proposed method with its accumulative update strategy can achieve a macro‐averaged blocking rate of 44.15% when the update frequency is set to 1 day. In addition, the overblocking rates remain below 9% over time, owing to the strong advantages of our search‐intent‐based method. This user‐behavior‐oriented method can be easily applied to search engines because it relies only on the implicit collective intelligence in query logs and requires no additional effort. In practice, it is complementary to intelligent content analysis for keeping up with the changing trails of objectionable websites from users' perspectives.
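
The three steps in the abstract (flag suspected queries, vote on clicked URLs, check the URL string for a keyword) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the lexicon, the keyword list, or the exact majority voting rule, so the names SUSPECT_LEXICON, URL_KEYWORDS, and vote_threshold, and the "more than half of the clicks" rule, are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical inputs: the paper builds its lexicon automatically; here a
# tiny hand-written lexicon and keyword list stand in for it (assumption).
SUSPECT_LEXICON = {"porn", "xxx"}
URL_KEYWORDS = ("porn", "xxx", "adult")


def is_suspected(query, lexicon=SUSPECT_LEXICON):
    """Step 1: flag a query if any of its terms appears in the lexicon."""
    return any(term in lexicon for term in query.lower().split())


def build_blacklist(click_log, vote_threshold=0.5):
    """Build a blacklist from an iterable of (query, clicked_url) pairs."""
    suspected_clicks = defaultdict(int)
    total_clicks = defaultdict(int)
    for query, url in click_log:
        total_clicks[url] += 1
        if is_suspected(query):
            suspected_clicks[url] += 1

    blacklist = set()
    for url, n_clicks in total_clicks.items():
        # Step 2 (assumed rule): the URL is a candidate only if more than
        # vote_threshold of its clicks came from suspected queries.
        if suspected_clicks[url] / n_clicks <= vote_threshold:
            continue
        # Step 3: keep the candidate only if its URL string contains at
        # least one categorical keyword.
        if any(keyword in url.lower() for keyword in URL_KEYWORDS):
            blacklist.add(url)
    return blacklist


sample_log = [
    ("free porn videos", "http://pornsite.example/a"),
    ("xxx clips", "http://pornsite.example/a"),
    ("news today", "http://pornsite.example/a"),
    ("porn", "http://news.example/"),
    ("weather", "http://news.example/"),
    ("weather report", "http://news.example/"),
    ("xxx", "http://hidden.example/"),
]
result = build_blacklist(sample_log)
```

In this toy log only the first URL passes both the vote and the keyword check; the second fails the vote, and the third passes the vote but has no keyword in its URL. The keyword check in step 3 trades recall for precision, which is in line with the abstract's emphasis on a low false‐positive rate.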

Date: 2012

Download: https://doi.org/10.1002/asi.21668



Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:63:y:2012:i:2:p:366-376

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890


More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.

 
Handle: RePEc:bla:jamist:v:63:y:2012:i:2:p:366-376