STAREG: Statistical replicability analysis of high throughput experiments with applications to spatial transcriptomic studies
Yan Li, Xiang Zhou, Rui Chen, Xianyang Zhang and Hongyuan Cao
PLOS Genetics, 2024, vol. 20, issue 10, 1-19
Abstract:
Replicable signals from different yet conceptually related studies provide stronger scientific evidence and more powerful inference. We introduce STAREG, a statistical method for replicability analysis of high throughput experiments, and apply it to spatial transcriptomic studies. STAREG uses summary statistics from multiple studies of high throughput experiments and models the joint distribution of p-values, accounting for the heterogeneity of different studies. It effectively controls the false discovery rate (FDR) and gains power by borrowing information across studies. Moreover, it provides different rankings of important genes. Combining the EM algorithm with the pool-adjacent-violator algorithm (PAVA), STAREG scales to datasets with millions of genes without any tuning parameters. Analyzing two pairs of spatially resolved transcriptomic datasets, we make biological discoveries that cannot be obtained with existing methods.
Author summary:
Irreplicable research wastes time, money, and resources. An estimated $28 billion is spent every year in the United States alone on preclinical research that cannot be replicated. Possible causes of irreplicable research include experimental design, laboratory practices, and data analysis. We focus on data analysis. The past two decades have witnessed the expansion and increased availability of genomic data from high-throughput experiments. Due to privacy concerns or logistical reasons, raw data can be difficult to access, but summary data such as p-values are readily available. We introduce STAREG, which jointly analyzes p-values from multiple genomic datasets that target the same scientific question with different populations or different technologies. This allows us to obtain more convincing and robust findings. STAREG is computationally scalable and rests on solid statistical analysis. Moreover, it is versatile, platform-independent, and only requires p-values as input. By analyzing datasets from spatially resolved transcriptomic studies, we make biological discoveries that cannot be obtained with existing methods.
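The abstract names the pool-adjacent-violator algorithm (PAVA) as one of the two computational ingredients of STAREG, but this listing does not describe how it is used inside the EM iterations. As a generic illustration only, and not the authors' implementation, the following minimal Python sketch shows what PAVA itself computes: the weighted least-squares fit to a sequence under a monotonicity constraint. The function names `pava_nondecreasing` and `pava_nonincreasing` are hypothetical, and the sketch assumes strictly positive weights.

```python
from typing import List, Optional, Sequence


def pava_nondecreasing(y: Sequence[float],
                       w: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted least-squares fit of a non-decreasing sequence to y.

    Generic PAVA sketch (hypothetical helper, not taken from STAREG).
    Assumes all weights are strictly positive.
    """
    weights = [1.0] * len(y) if w is None else list(w)
    # Each block holds [weighted mean, total weight, number of points].
    blocks: List[List[float]] = []
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Pool adjacent blocks while the last two violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand each pooled block back to one fitted value per input point.
    fitted: List[float] = []
    for mean, _, count in blocks:
        fitted.extend([mean] * int(count))
    return fitted


def pava_nonincreasing(y: Sequence[float],
                       w: Optional[Sequence[float]] = None) -> List[float]:
    """Non-increasing fit, obtained by negating the input and the output."""
    return [-v for v in pava_nondecreasing([-v for v in y], w)]


if __name__ == "__main__":
    # The middle pair (3, 2) violates the ordering and is pooled to its mean.
    print(pava_nondecreasing([1, 3, 2, 4]))
    print(pava_nonincreasing([4, 2, 3, 1]))
```

Each input point enters the block list once and each merge removes a block, so the pass runs in linear time, which is consistent with the scalability the abstract attributes to the EM and PAVA combination.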
Date: 2024
Downloads: (external link)
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011423 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 11423&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1011423
DOI: 10.1371/journal.pgen.1011423
More articles in PLOS Genetics from Public Library of Science