Analysis validation has been neglected in the Age of Reproducibility
Kathleen E Lotterhos, Jason H Moore and Ann E Stapleton
PLOS Biology, 2018, vol. 16, issue 12, 1-15
Abstract:
Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique patterns of nonindependence in every biological dataset. We advocate that analyses should be evaluated with known-truth simulations that capture biological reality, a process we call “analysis validation.” We review the process of validation and suggest criteria that a validation project should meet. We find that different fields of science have historically failed to meet all criteria, and we suggest ways to implement meaningful validation in training and practice.
Just as we do controls for experiments, we should all do controls for data analysis; this is easy to say but requires dedication to implement. This Essay explains the need for analysis validation and provides specific suggestions for how to get started.
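As a rough illustration of the idea of a known-truth simulation (this sketch is not taken from the article; the model, parameter values, and grouping structure are hypothetical), the Python snippet below simulates data with a known treatment effect and a simple form of nonindependence (observations clustered within groups), then runs a naive analysis to see whether the truth is recovered, in the spirit of a "control" for a data-analysis pipeline.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# "Known truth": a treatment effect of 0.5 on the outcome (hypothetical value).
TRUE_EFFECT = 0.5
N_GROUPS, PER_GROUP = 20, 10

# Simulate nonindependent data: observations within a group share a random offset,
# and treatment is assigned at the group level.
group_offsets = rng.normal(0.0, 1.0, size=N_GROUPS)
treatment = np.repeat(rng.integers(0, 2, size=N_GROUPS), PER_GROUP)
noise = rng.normal(0.0, 1.0, size=N_GROUPS * PER_GROUP)
outcome = TRUE_EFFECT * treatment + np.repeat(group_offsets, PER_GROUP) + noise

# Naive analysis that ignores the grouping: an ordinary two-sample t-test.
estimate = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()
t_stat, p_value = stats.ttest_ind(outcome[treatment == 1], outcome[treatment == 0])

print(f"true effect = {TRUE_EFFECT}, estimate = {estimate:.2f}, p = {p_value:.3f}")
# Repeating this simulation many times would reveal whether the naive analysis
# yields unbiased estimates and calibrated p-values under this nonindependence,
# which is the kind of check the authors call "analysis validation."
```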
Date: 2018
Downloads:
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000070 (text/html)
https://journals.plos.org/plosbiology/article/file ... 00070&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pbio00:3000070
DOI: 10.1371/journal.pbio.3000070