GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases
Sehyun Oh,
Ludwig Geistlinger,
Marcel Ramos,
Daniel Blankenberg,
Marius Beek,
Jaclyn N. Taroni,
Vincent J. Carey,
Casey S. Greene,
Levi Waldron and
Sean Davis
Additional contact information
Sehyun Oh: City University of New York
Ludwig Geistlinger: Harvard Medical School
Marcel Ramos: City University of New York
Daniel Blankenberg: Lerner Research Institute, Cleveland Clinic
Marius Beek: The Pennsylvania State University
Jaclyn N. Taroni: Alex’s Lemonade Stand Foundation
Vincent J. Carey: Harvard Medical School
Casey S. Greene: University of Colorado Anschutz School of Medicine
Levi Waldron: City University of New York
Sean Davis: University of Colorado Anschutz School of Medicine
Nature Communications, 2022, vol. 13, issue 1, 1-10
Abstract:
Millions of transcriptomic profiles have been deposited in public archives, yet remain underused for the interpretation of new experiments. We present a method for interpreting new transcriptomic datasets through instant comparison to public datasets without high-performance computing requirements. We apply Principal Component Analysis to 536 studies comprising 44,890 human RNA sequencing profiles and aggregate sufficiently similar loading vectors to form Replicable Axes of Variation (RAVs). RAVs are annotated with metadata of originating studies and by gene set enrichment analysis. Functionality to associate new datasets with RAVs, extract interpretable annotations, and provide intuitive visualization is implemented as the GenomicSuperSignature R/Bioconductor package. We demonstrate the efficient and coherent database search, robustness to batch effects and heterogeneous training data, and transfer learning capacity of our method using TCGA and rare disease datasets. GenomicSuperSignature aids in analyzing new gene expression data in the context of existing databases using minimal computing resources.
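The aggregation idea in the abstract can be illustrated with a minimal Python sketch. It is not the package's implementation: the greedy threshold grouping stands in for whatever clustering the authors use, and the function names (`pearson`, `aggregate_loadings`, `score`), the `threshold` value, and the toy loading vectors are all hypothetical. The sketch groups per-study loading vectors that correlate strongly, averages each group seen in at least two inputs into one axis, and scores a new profile against that axis with a dot product. It also ignores the arbitrary sign of principal component loadings.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)


def aggregate_loadings(loadings, threshold=0.9):
    """Greedy stand-in for clustering: a vector joins the first group whose
    seed vector it correlates with at or above `threshold` (an assumed value).
    Groups with at least two members are averaged gene by gene."""
    groups = []
    for vec in loadings:
        for group in groups:
            if pearson(vec, group[0]) >= threshold:
                group.append(vec)
                break
        else:
            groups.append([vec])
    return [
        [sum(column) / len(group) for column in zip(*group)]
        for group in groups
        if len(group) >= 2
    ]


def score(profile, axis):
    """Project one expression profile onto an aggregated axis."""
    return sum(p * a for p, a in zip(profile, axis))


# Toy loading vectors over four genes (made-up numbers, not real data).
loadings = [
    [0.8, 0.6, 0.0, 0.0],   # hypothetical study A, one component
    [0.7, 0.7, 0.1, 0.0],   # hypothetical study B, similar component
    [0.0, 0.1, 0.9, -0.4],  # hypothetical study C, dissimilar component
]
ravs = aggregate_loadings(loadings)
new_profile = [1.0, 1.0, 0.0, 0.0]
scores = [score(new_profile, axis) for axis in ravs]
```

Because the averaged axes are computed once, scoring a new dataset in this sketch needs only dot products, which is in keeping with the abstract's emphasis on minimal computing resources.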
Date: 2022
Downloads: (external link)
https://www.nature.com/articles/s41467-022-31411-3 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-31411-3
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-022-31411-3
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.