scDSSC: Deep Sparse Subspace Clustering for scRNA-seq Data
HaiYun Wang,
JianPing Zhao,
ChunHou Zheng and
YanSen Su
PLOS Computational Biology, 2022, vol. 18, issue 12, 1-18
Abstract:
Single-cell RNA sequencing (scRNA-seq) enables researchers to characterize transcriptomic profiles at single-cell resolution with increasingly high throughput. Clustering is a crucial step in single-cell analysis: clustering the transcriptomes profiled by scRNA-seq can reveal the heterogeneity and diversity of cells. However, single-cell analysis remains challenging because the data are noisy and high-dimensional. Subspace clustering aims to discover the intrinsic structure of data in an unsupervised fashion. In this paper, we propose scDSSC, a deep sparse subspace clustering method for scRNA-seq data that combines noise reduction with dimensionality reduction and simultaneously learns the feature representation and the clustering via explicit modelling of scRNA-seq data generation. Experiments on a variety of scRNA-seq datasets, ranging from thousands to tens of thousands of cells, show that scDSSC can significantly improve clustering performance and facilitate the interpretability of clustering and downstream analysis. scDSSC outperformed several popular, state-of-the-art scRNA-seq analysis methods under various clustering performance metrics.
Author summary:
Single-cell RNA sequencing (scRNA-seq) data have been widely used in neuroscience, immunology, oncology and other research fields. Cell type recognition is an important goal of scRNA-seq data analysis, for which clustering analysis is commonly used. However, single-cell clustering remains challenging because of the high noise, high dimensionality and increasing scale of the data. Considering the advantages of subspace manifolds in processing high-dimensional data and the powerful representation learning ability of deep neural networks, we propose a novel single-cell clustering method, scDSSC, which imitates the generation of scRNA-seq data, reduces the dimension and the noise of the data at the same time, and finally outputs the clustering results. Experiments on a variety of scRNA-seq datasets, ranging from thousands to tens of thousands of cells, show that scDSSC can significantly improve downstream analysis, including clustering analysis, cell visualization, differential expression analysis and trajectory inference. In addition, scDSSC scales well and can handle large-scale scRNA-seq data.
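To make the idea of sparse subspace clustering more concrete, here is a minimal sketch of its classical, non-deep form. It is not the scDSSC model described above: it has no neural network and no model of scRNA-seq data generation, and it works directly on a small synthetic matrix. Each cell is written as a sparse combination of the other cells, and the resulting coefficients define a cell-to-cell affinity. The function names, the penalty weight `lam`, the iteration count and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sparse_self_expression(X, lam=0.05, n_iter=500):
    """Approximately solve
        min_C 0.5 * ||X - C X||_F^2 + lam * ||C||_1   s.t. diag(C) = 0
    by proximal gradient descent (ISTA). X has shape (n_cells, n_features)."""
    G = X @ X.T                           # Gram matrix of the cells
    step = 1.0 / np.linalg.norm(G, 2)     # 1 / Lipschitz constant of the gradient
    C = np.zeros_like(G)
    for _ in range(n_iter):
        C = C - step * (C @ G - G)        # gradient step on the reconstruction term
        # Soft-thresholding: proximal operator of the L1 penalty
        C = np.sign(C) * np.maximum(np.abs(C) - step * lam, 0.0)
        np.fill_diagonal(C, 0.0)          # a cell may not represent itself
    return C

def affinity_from_coefficients(C):
    """Symmetric, non-negative affinity built from the coefficients."""
    A = np.abs(C)
    return A + A.T

# Toy data (illustrative only): two groups of "cells", each lying in its own
# 2-dimensional subspace of a 6-dimensional feature space.
rng = np.random.default_rng(0)
n_per_group = 15
basis_a = np.zeros((2, 6))
basis_a[0, 0] = basis_a[1, 1] = 1.0       # group A spans features 0 and 1
basis_b = np.zeros((2, 6))
basis_b[0, 2] = basis_b[1, 3] = 1.0       # group B spans features 2 and 3
X = np.vstack([
    rng.normal(size=(n_per_group, 2)) @ basis_a,
    rng.normal(size=(n_per_group, 2)) @ basis_b,
])
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-normalize each cell

C = sparse_self_expression(X)
W = affinity_from_coefficients(C)
```

Because the two toy subspaces are orthogonal, every cell is represented only by cells from its own group, so `W` has no weight between the groups. In a full pipeline the affinity would typically be passed to a spectral clustering step to obtain cluster labels, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`.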
Date: 2022
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010772 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 10772&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1010772
DOI: 10.1371/journal.pcbi.1010772