EconPapers    
Economics at your fingertips  
 

Mcadet: A feature selection method for fine-resolution single-cell RNA-seq data based on multiple correspondence analysis and community detection

Saishi Cui, Sina Nassiri and Issa Zakeri

PLOS Computational Biology, 2024, vol. 20, issue 10, 1-39

Abstract: Single-cell RNA sequencing (scRNA-seq) data analysis faces numerous challenges, including high sparsity, a high-dimensional feature space, and biological noise. These challenges hinder downstream analysis, necessitating the use of feature selection methods to identify informative genes, and reduce data dimensionality. However, existing methods for selecting highly variable genes (HVGs) exhibit limited overlap and inconsistent clustering performance across benchmark datasets. Moreover, these methods often struggle to accurately select HVGs from fine-resolution scRNA-seq datasets and minority cell types, which are more difficult to distinguish, raising concerns about the reliability of their results. To overcome these limitations, we propose a novel feature selection framework for scRNA-seq data called Mcadet. Mcadet integrates Multiple Correspondence Analysis (MCA), graph-based community detection, and a novel statistical testing approach. To assess the effectiveness of Mcadet, we conducted extensive evaluations using both simulated and real-world data, employing unbiased metrics for comparison. Our results demonstrate the superior performance of Mcadet in the selection of HVGs in scenarios involving fine-resolution scRNA-seq datasets and datasets containing minority cell populations. Overall, we demonstrate that Mcadet enhances the reliability of selected HVGs, although the impact of HVG selection on various downstream analyses varies and needs to be further investigated.Author summary: scRNA-seq brings both great opportunities and challenges for transcriptomic analysis. While scRNA-seq enables the characterization of cell heterogeneity at an unprecedented resolution, analytical issues like sparsity, noise and bias can severely compromise interpretation if not addressed properly. To extract meaningful biological signals, effective feature selection is critical. We propose Mcadet, a novel framework for feature selection in scRNA-seq data. Mcadet aims to accurately identify informative genes from fine-resolution datasets and datasets with minority cell types where existing methods falter.

Date: 2024
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012560 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 12560&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1012560

DOI: 10.1371/journal.pcbi.1012560

Access Statistics for this article

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().

 
Page updated 2025-05-31
Handle: RePEc:plo:pcbi00:1012560