An Investigation of p-Hacking in E-Commerce A/B Testing
Alex P. Miller and Kartik Hosanagar
Additional contact information
Alex P. Miller: Marshall School of Business, University of Southern California, Los Angeles, California 90089
Kartik Hosanagar: The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104
Information Systems Research, 2025, vol. 36, issue 3, 1691-1717
Abstract:
In recent years, randomized experiments (or “A/B tests”) have become commonplace in many industrial settings as managers increasingly seek the aid of scientific rigor in their decision making. However, just as this practice has proliferated among firms, the problem of p-hacking, whereby experimenters adjust their sample size or try several statistical analyses until they find one that produces a statistically significant p-value, has emerged as a prevalent concern in the scientific community. Notably, many commentators have highlighted how A/B testing software enables and may even encourage p-hacking behavior. To investigate this phenomenon, we analyze the prevalence of p-hacking in a primary sample of 2,270 experiments conducted by 242 firms on a large U.S.-based e-commerce A/B testing platform. Using multiple statistical techniques, including a novel approach we call the asymmetric caliper test, we analyze the p-values corresponding to each experiment’s designated target metric across multiple significance thresholds. Our findings reveal essentially no evidence for p-hacking in our data. In an extended sample that examines p-hacking across all outcome metrics (encompassing more than 16,000 p-values in total), we similarly observe no evidence of p-hacking behavior. We use simulations to determine that if a modest effect of p-hacking were present in our data set, our methods would have the power to detect it at our current sample size. We contrast our results with the prevalence of p-hacking in academic contexts and discuss a number of possible factors explaining the divergent results, highlighting the potential roles of organizational learning and economic incentives.
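For readers unfamiliar with caliper tests, the following is a minimal sketch of a conventional (symmetric) caliper test applied directly to p-values. It is not the asymmetric caliper test introduced in the article, whose details are not given in this abstract. The function name `caliper_test`, the window `width`, and the example data are assumptions made purely for illustration. The idea is that, absent p-hacking, p-values falling in a narrow window just below a significance threshold should be roughly as common as those just above it, so a marked excess just below is suspicious. Equal-width windows on the p-value scale are only an approximation, since the density of p-values need not be flat near the threshold.

```python
from math import comb

def caliper_test(p_values, threshold=0.05, width=0.01):
    """Illustrative symmetric caliper test (hypothetical helper, not the paper's method).

    Counts p-values in [threshold - width, threshold) and in
    [threshold, threshold + width), then returns a one-sided exact
    binomial tail probability for an excess just below the threshold.
    """
    below = sum(1 for p in p_values if threshold - width <= p < threshold)
    above = sum(1 for p in p_values if threshold <= p < threshold + width)
    n = below + above
    if n == 0:
        # No observations inside the caliper: nothing to test.
        return below, above, 1.0
    # P(X >= below) for X ~ Binomial(n, 0.5), computed exactly.
    tail = sum(comb(n, k) for k in range(below, n + 1)) / 2 ** n
    return below, above, tail

# Made-up p-values for demonstration only (not data from the article).
example = [0.045] * 8 + [0.055] * 2 + [0.30, 0.72]
below, above, tail = caliper_test(example)
print(below, above, round(tail, 4))
```

A small tail probability would suggest bunching just below the threshold. In practice the result is sensitive to the chosen window width, so it is common to report several widths.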
Keywords: p-hacking; A/B testing; data-driven decision making; electronic commerce; mixture models
Date: 2025
Download: http://dx.doi.org/10.1287/isre.2024.0872 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:orisre:v:36:y:2025:i:3:p:1691-1717