EconPapers    
Economics at your fingertips  
 

Secuer: Ultrafast, scalable and accurate clustering of single-cell RNA-seq data

Nana Wei, Yating Nie, Lin Liu, Xiaoqi Zheng and Hua-Jun Wu

PLOS Computational Biology, 2022, vol. 18, issue 12, 1-20

Abstract: Identifying cell clusters is a critical step for single-cell transcriptomics study. Despite the numerous clustering tools developed recently, the rapid growth of scRNA-seq volumes prompts for a more (computationally) efficient clustering method. Here, we introduce Secuer, a Scalable and Efficient speCtral clUstERing algorithm for scRNA-seq data. By employing an anchor-based bipartite graph representation algorithm, Secuer enjoys reduced runtime and memory usage over one order of magnitude for datasets with more than 1 million cells. Meanwhile, Secuer also achieves better or comparable accuracy than competing methods in small and moderate benchmark datasets. Furthermore, we showcase that Secuer can also serve as a building block for a new consensus clustering method, Secuer-consensus, which again improves the runtime and scalability of state-of-the-art consensus clustering methods while also maintaining the accuracy. Overall, Secuer is a versatile, accurate, and scalable clustering framework suitable for small to ultra-large single-cell clustering tasks.Author summary: Recently, single-cell RNA sequencing (scRNA-seq) has enabled profiling of thousands to millions of cells, spurring the development of efficient clustering algorithms for large or ultra-large datasets. In this work, we developed an ultrafast clustering method, Secuer, for small to ultra-large scRNA-seq data. Using simulation and real datasets, we demonstrated that Secuer yields high accuracy, while saving runtime and memory usage by orders of magnitude, and that it can be efficiently scaled up to ultra-large datasets. Additionally, with Secuer as a subroutine, we proposed Secuer-consensus, a consensus clustering algorithm. Our results show that Secuer-consensus performs better in terms of clustering accuracy and runtime.

Date: 2022
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010753 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 10753&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1010753

DOI: 10.1371/journal.pcbi.1010753

Access Statistics for this article

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().

 
Page updated 2025-05-04
Handle: RePEc:plo:pcbi00:1010753