scTab: Scaling cross-tissue single-cell annotation models
Felix Fischer,
David S. Fischer,
Roman Mukhin,
Andrey Isaev,
Evan Biederstedt,
Alexandra-Chloé Villani and
Fabian J. Theis
Additional contact information
Felix Fischer: Institute of Computational Biology
David S. Fischer: Institute of Computational Biology
Roman Mukhin: eBook Applications LLC
Andrey Isaev: eBook Applications LLC
Evan Biederstedt: Harvard Medical School
Alexandra-Chloé Villani: Broad Institute of MIT and Harvard
Fabian J. Theis: Institute of Computational Biology
Nature Communications, 2024, vol. 15, issue 1, 1-15
Abstract:
Identifying cellular identities is a key use case in single-cell transcriptomics. While machine learning has been leveraged to automate cell annotation predictions for some time, there has been little progress in scaling neural networks to large data sets and in constructing models that generalize well across diverse tissues. Here, we propose scTab, an automated cell type prediction model specific to tabular data, and train it using a novel data augmentation scheme across a large corpus of single-cell RNA-seq observations (22.2 million cells). In this context, we show that cross-tissue annotation requires nonlinear models and that the performance of scTab scales both in terms of training dataset size and model size. Additionally, we show that the proposed data augmentation scheme improves model generalization. In summary, we introduce a de novo cell type prediction model for single-cell RNA-seq data that can be trained across a large-scale collection of curated datasets and demonstrate the benefits of using deep learning methods in this paradigm.
Date: 2024
Downloads:
https://www.nature.com/articles/s41467-024-51059-5 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-51059-5
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-024-51059-5
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.