EconPapers    
Economics at your fingertips  
 

A component overlapping attribute clustering (COAC) algorithm for single-cell RNA sequencing data analysis and potential pathobiological implications

He Peng, Xiangxiang Zeng, Yadi Zhou, Defu Zhang, Ruth Nussinov and Feixiong Cheng

PLOS Computational Biology, 2019, vol. 15, issue 2, 1-17

Abstract: Recent advances in next-generation sequencing and computational technologies have enabled routine analysis of large-scale single-cell ribonucleic acid sequencing (scRNA-seq) data. However, scRNA-seq technologies have suffered from several technical challenges, including low mean expression levels in most genes and higher frequencies of missing data than bulk population sequencing technologies. Identifying functional gene sets and their regulatory networks that link specific cell types to human diseases and therapeutics from scRNA-seq profiles are daunting tasks. In this study, we developed a Component Overlapping Attribute Clustering (COAC) algorithm to perform the localized (cell subpopulation) gene co-expression network analysis from large-scale scRNA-seq profiles. Gene subnetworks that represent specific gene co-expression patterns are inferred from the components of a decomposed matrix of scRNA-seq profiles. We showed that single-cell gene subnetworks identified by COAC from multiple time points within cell phases can be used for cell type identification with high accuracy (83%). In addition, COAC-inferred subnetworks from melanoma patients’ scRNA-seq profiles are highly correlated with survival rate from The Cancer Genome Atlas (TCGA). Moreover, the localized gene subnetworks identified by COAC from individual patients’ scRNA-seq data can be used as pharmacogenomics biomarkers to predict drug responses (The area under the receiver operating characteristic curves ranges from 0.728 to 0.783) in cancer cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) database. In summary, COAC offers a powerful tool to identify potential network-based diagnostic and pharmacogenomics biomarkers from large-scale scRNA-seq profiles. COAC is freely available at https://github.com/ChengF-Lab/COAC.Author summary: Single-cell RNA sequencing (scRNA-seq) can reveal complex and rare cell populations, uncover gene regulatory relationships, track the trajectories of distinct cell lineages in development, and identify cell-cell variabilities in human diseases and therapeutics. Although experimental methods for scRNA-seq are increasingly accessible, computational approaches to infer gene regulatory networks from raw data remain limited. From a single-cell perspective, the stochastic features of a single cell must be properly embedded into gene regulatory networks. However, it is difficult to identify technical noise (e.g., low mean expression levels and missing data) and cell-cell variabilities remain poorly understood. In this study, we introduced a network-based approach, termed Component Overlapping Attribute Clustering (COAC), to infer novel gene-gene subnetworks in individual components (subsets of whole components) representing multiple cell types and phases of scRNA-seq data. We showed that COAC can reduce batch effects and identify specific cell types in two large-scale human scRNA-seq datasets. Importantly, we demonstrated that gene subnetworks identified by COAC from scRNA-seq profiles highly correlated with patients's survival and drug responses in cancer, offering a novel computational tool for advancing precision medicine.

Date: 2019
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006772 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 06772&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1006772

DOI: 10.1371/journal.pcbi.1006772

Access Statistics for this article

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().

 
Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1006772