Cherry Picking

Lang, Megan; Qiu, Wenfeng

Megan Lang and Wenfeng Qiu

No as9zd, MetaArXiv from Center for Open Science

Abstract: Measures like pre-analysis plans ask researchers to describe planned data collection and justify data exclusions, but they provide little enforceable oversight of primary data collection. We show that a simple algorithm can select large subsets of data that yield economically meaningful and statistically significant treatment effects. The subsets cannot be distinguished from a random sample of the original data, rendering the selection undetectable if peer reviewers are unaware of the size of the original dataset. Our results hold using simulated data and replication data from a well-known study. We show that there are few natural deterrents to dataset manipulation: the results in our selected subset are robust to a range of alternative specifications, our algorithm performs well under complex sampling strategies, and our subset can yield artificially high effects on multiple outcomes. We conclude by proposing a measure to prevent such manipulation in field experiments.
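The kind of selection the abstract describes can be illustrated with a minimal sketch. It is not the authors' algorithm, which this listing does not specify. It assumes a crude outcome-based trimming rule: keep the highest treated outcomes and the lowest control outcomes. The names `t_stat`, `cherry_pick`, and `keep_share` are hypothetical, and the data are simulated with a true treatment effect of zero. Unlike the method in the abstract, this rule makes no attempt to keep the subset indistinguishable from a random sample.

```python
import math
import random
import statistics


def t_stat(treated, control):
    """Welch t-statistic for the difference in means (treated minus control)."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return diff / se


def cherry_pick(treated, control, keep_share=0.8):
    """Hypothetical trimming rule (an assumption, not the paper's method):
    keep the highest treated outcomes and the lowest control outcomes,
    retaining keep_share of each arm."""
    n_t = int(len(treated) * keep_share)
    n_c = int(len(control) * keep_share)
    kept_t = sorted(treated)[len(treated) - n_t:]  # drop the lowest treated outcomes
    kept_c = sorted(control)[:n_c]                 # drop the highest control outcomes
    return kept_t, kept_c


# Simulated experiment: both arms share one distribution, so the true effect is zero.
random.seed(0)
treated = [random.gauss(0.0, 1.0) for _ in range(200)]
control = [random.gauss(0.0, 1.0) for _ in range(200)]

kept_t, kept_c = cherry_pick(treated, control, keep_share=0.8)

print("full sample t-statistic:    ", round(t_stat(treated, control), 2))
print("selected subset t-statistic:", round(t_stat(kept_t, kept_c), 2))
```

Even with 80 percent of each arm retained, the selected subset shows a large and statistically significant "effect" where none exists. A reviewer who does not know the original sample size has no baseline against which to notice the missing observations.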

Date: 2021-08-24
New Economics Papers: this item is included in nep-ecm and nep-isf

Downloads: (external link)
https://osf.io/download/61256d816a7f6d001f47ab8a/

Persistent link: https://EconPapers.repec.org/RePEc:osf:metaar:as9zd

DOI: 10.31219/osf.io/as9zd
