Are Neighbors Alike? A Semisupervised Probabilistic Collaborative Learning Model for Online Review Spammers Detection

Zhiang Wu, Guannan Liu, Junjie Wu and Yong Tan
Additional contact information
Zhiang Wu: School of Computer Science, Nanjing Audit University, Nanjing 211815, China
Guannan Liu: School of Economics and Management, Beihang University, Beijing 100191, China; Key Laboratory of Data Intelligence and Management, Ministry of Industry and Information Technology, Beijing 100191, China
Junjie Wu: School of Economics and Management, Beihang University, Beijing 100191, China; Key Laboratory of Data Intelligence and Management, Ministry of Industry and Information Technology, Beijing 100191, China
Yong Tan: Michael G. Foster School of Business, University of Washington, Seattle, Washington 98195

Information Systems Research, 2024, vol. 35, issue 4, 1565-1585

Abstract: Review spammers can harm the trustworthy environment of online platforms by purposefully posting unauthentic ratings and comments for products or online merchants, with the aim of gaining improper benefits. Although many methods have been proposed to resolve the spammer detection problem, several challenges, such as collusion recognition, label scarcity, and biased distributions, are still persistent and call for further investigation. Building on prevalent collusive spamming behaviors and the network homophily theory, we introduce a reviewer network to account for explicit coreview relations, and then, we propose a semisupervised probabilistic collaborative learning model to capture both reviewers’ individual behavioral features and the reviewer network. Our model features integrating partial label propagation with a pseudolabeling strategy and feature-based learning for reviewer network modeling, which is proved theoretically to be a weighted logistic regression on a network-derived synthetic data set. The rich parameters that characterize the importance of network information, the strength of network homophily, and the value of unlabeled data make our model more transparent. The empirical evaluations on two distinctive real-life data sets have demonstrated the effectiveness of our model and the significance of unlabeled data, in which the reviewer network after proper trimming demonstrates notable homophily effects and plays a vital role. In particular, the proposed model exhibits robustness against label scarcity and biased label distribution.
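
As a rough illustration of the ideas named in the abstract (a coreview reviewer network, trimming, label propagation with pseudolabels, and a weighted logistic regression), the following is a minimal sketch and not the authors' model. The reviewer names, feature values, the `min_shared` trimming threshold, and the `PSEUDO_WEIGHT` constant are all hypothetical, and the two-stage pipeline is a simplification assumed here for illustration rather than the network-derived synthetic data set construction described in the paper.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_coreview_network(reviews, min_shared=1):
    """Link reviewers who reviewed the same product; the edge weight is the
    number of shared products. Edges below min_shared are trimmed."""
    by_product = defaultdict(set)
    for reviewer, product in reviews:
        by_product[product].add(reviewer)
    shared = defaultdict(int)
    for reviewers in by_product.values():
        for u, v in combinations(sorted(reviewers), 2):
            shared[(u, v)] += 1
    graph = defaultdict(dict)
    for (u, v), weight in shared.items():
        if weight >= min_shared:
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


def propagate_labels(graph, labels, nodes, iterations=20):
    """Clamped label propagation: labeled nodes keep their label, unlabeled
    nodes take the weighted mean of their neighbors' current scores."""
    scores = {n: float(labels.get(n, 0.5)) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            neighbors = graph.get(n, {})
            total = sum(neighbors.values())
            if n in labels or total == 0:
                updated[n] = scores[n]
            else:
                updated[n] = sum(w * scores[v] for v, w in neighbors.items()) / total
        scores = updated
    return scores


def predict(x, weights, bias):
    """Logistic probability that a reviewer with features x is a spammer."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logistic(X, y, sample_weight, lr=0.5, epochs=500):
    """Batch gradient descent on a sample-weighted cross-entropy loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    total = sum(sample_weight)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi, si in zip(X, y, sample_weight):
            err = si * (predict(xi, weights, bias) - yi)
            grad_w = [g + err * xij for g, xij in zip(grad_w, xi)]
            grad_b += err
        weights = [w - lr * g / total for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / total
    return weights, bias


# Hypothetical toy data: (reviewer, product) pairs and two behavioral features.
reviews = [("a", "p1"), ("a", "p2"), ("b", "p1"), ("b", "p2"),
           ("c", "p1"), ("c", "p2"), ("c", "p5"), ("d", "p5"),
           ("d", "p3"), ("d", "p4"), ("e", "p3"), ("e", "p4")]
features = {"a": [0.9, 0.8], "b": [0.8, 0.9], "c": [0.85, 0.7],
            "d": [0.1, 0.2], "e": [0.2, 0.1]}
labels = {"a": 1, "d": 0}  # 1 = known spammer, 0 = known genuine reviewer
PSEUDO_WEIGHT = 0.5        # assumed down-weighting of pseudolabeled reviewers

nodes = sorted(features)
graph = build_coreview_network(reviews, min_shared=2)  # trims the weak c-d edge
scores = propagate_labels(graph, labels, nodes)

X = [features[n] for n in nodes]
y = [scores[n] for n in nodes]  # true labels plus soft pseudolabels
sample_weight = [1.0 if n in labels else PSEUDO_WEIGHT for n in nodes]
weights, bias = fit_weighted_logistic(X, y, sample_weight)
spam_prob = {n: predict(features[n], weights, bias) for n in nodes}
```

In this toy setup, trimming removes the single-product link between reviewers `c` and `d`, so labels spread only within the two tightly coreviewing groups, and the pseudolabeled reviewers then enter the regression with reduced weight.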

Keywords: spammer detection; semisupervised; collaborative learning; reviewer network; homophily effect
Date: 2024

Downloads: (external link)
http://dx.doi.org/10.1287/isre.2022.0047 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:orisre:v:35:y:2024:i:4:p:1565-1585

Page updated 2025-03-19
Handle: RePEc:inm:orisre:v:35:y:2024:i:4:p:1565-1585