False Discovery in A/B Testing
Ron Berman and Christophe Van den Bulte
Additional contact information
Ron Berman: Marketing, The Wharton School of the University of Pennsylvania, Philadelphia, Pennsylvania 19104
Christophe Van den Bulte: Marketing, The Wharton School of the University of Pennsylvania, Philadelphia, Pennsylvania 19104
Management Science, 2022, vol. 68, issue 9, 6762-6782
Abstract:
We investigate what fraction of all significant results in website A/B testing is actually null effects (i.e., the false discovery rate (FDR)). Our data consist of 4,964 effects from 2,766 experiments conducted on a commercial A/B testing platform. Using three different methods, we find that the FDR ranges between 28% and 37% for tests conducted at 10% significance and between 18% and 25% for tests at 5% significance (two sided). These high FDRs stem mostly from the high fraction of true null effects, about 70%, rather than from low power. Using our estimates, we also assess the potential of various A/B test designs to reduce the FDR. The two main implications are that decision makers should expect one in five interventions achieving significance at 5% confidence to be ineffective when deployed in the field and that analysts should consider using two-stage designs with multiple variations rather than basic A/B tests.
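The abstract's claim that the high FDR comes mostly from the large share of true nulls, not from low power, can be illustrated with the textbook relation FDR = (null share x alpha) / (null share x alpha + (1 - null share) x power). The sketch below is only an illustration of that relation and is not the authors' estimation method. The 70% null share and the 5% significance level come from the abstract; the function name and the power values are hypothetical, chosen for illustration.

```python
def false_discovery_rate(null_share: float, alpha: float, power: float) -> float:
    """Expected share of significant results that are true nulls.

    Textbook relation, assuming every test uses the same alpha and
    every non-null effect is detected with the same power.
    """
    false_positives = null_share * alpha          # true nulls flagged as significant
    true_positives = (1.0 - null_share) * power   # real effects flagged as significant
    return false_positives / (false_positives + true_positives)


# Null share (0.70) and alpha (0.05) are taken from the abstract;
# the power values are hypothetical.
for power in (0.4, 0.6, 0.8):
    fdr = false_discovery_rate(null_share=0.70, alpha=0.05, power=power)
    print(f"power={power:.1f}  FDR={fdr:.3f}")
```

Under these assumptions, raising power from 40% to 80% lowers the FDR only from roughly 23% to roughly 13%, which is consistent with the abstract's point that the null share, not power, drives the result.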
Keywords: statistics; design of experiments; decision analysis; inference; A/B testing; false discovery rate
Date: 2022
Download: http://dx.doi.org/10.1287/mnsc.2021.4207
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:68:y:2022:i:9:p:6762-6782