PLNMFG: Pseudo-label guided non-negative matrix factorization model with graph constraint for single-cell multi-omics data clustering

Hui Yuan, Mingzhu Liu, Yushan Qiu, Wai-Ki Ching and Quan Zou

PLOS Computational Biology, 2025, vol. 21, issue 8, 1-17

Abstract: The development of single-cell multi-omics sequencing technologies has enabled the simultaneous analysis of multiple omics layers within the same cell. Accurate clustering of these cells is crucial for downstream analyses of complex biological functions. Despite significant advances in multi-omics integration, current methods have two major limitations. First, they inadequately incorporate prior biological knowledge from the various omic layers. Second, they often perform dimensionality reduction independently on each omic dataset, thereby failing to capture the intrinsic complementary information and potentially overlooking crucial cross-platform interactions. Motivated by these limitations, this study proposes a non-negative matrix factorization model called PLNMFG, which combines, in one joint framework, unified latent representation learning that retains the features between and within omics and cluster structure learning that retains the intrinsic structure of the data. Specifically, PLNMFG performs adaptive imputation to handle dropout events and uses prior pseudo-labels as constraints during collective non-negative matrix factorization; as a result, it obtains a more robust latent representation that preserves the double similarity information. A graph Laplacian constraint is applied during clustering, which further preserves the structural characteristics of the multi-omics data. In addition, the weight of each omic is learned adaptively according to its contribution. A series of experiments on 8 benchmark datasets shows that our model performs well in terms of clustering accuracy and computational efficiency.

Author summary: With the rapid advancement of biotechnology, we can obtain single-cell multi-omics data including genomics, transcriptomics, epigenomics, proteomics, and metabolomics. Single-cell clustering based on these omics data helps to reveal cell heterogeneity, enabling more precise analysis of the human body at the level of individual cells and thereby advancing our understanding of human systems. However, because single-cell multi-omics data are high-dimensional and sparse, clustering performance is generally poor. In this paper, a pseudo-label guided non-negative matrix factorization model with graph constraint (PLNMFG) is proposed for analyzing single-cell multi-omics data. It is the first model to integrate pseudo-labels, imputation, and clustering on the basis of non-negative matrix factorization, and it performs these tasks simultaneously in a unified manner. PLNMFG combines imputation techniques with non-negative matrix factorization to further enhance clustering accuracy. It applies an adaptive omics weighting strategy to match the importance of each omic layer, giving more influence to critical omics during clustering. PLNMFG also employs a collective matrix decomposition method based on pseudo-label constraints and thus avoids the traditional, computationally intensive feature decomposition and similarity graph construction. Furthermore, PLNMFG applies manifold constraints during clustering to further preserve the data structure, and it learns the latent representation and the clustering structure simultaneously in the same framework, making the latent representation more suitable for clustering. Experimental results on eight different datasets indicate that PLNMFG achieves outstanding clustering performance, validating its effectiveness and generalization ability.
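To make the ingredients named above concrete, the following is a minimal NumPy sketch of collective non-negative matrix factorization with a graph Laplacian penalty and adaptive per-omic weights. It is not the authors' PLNMFG implementation and omits the imputation step. Several choices are assumptions made only for illustration: the function names `label_affinity` and `collective_gnmf`, the objective sum_v alpha_v ||X_v - W_v H||_F^2 + lam * tr(H L H^T) with L = D - A, the use of a same-pseudo-label indicator as the graph A (which folds the pseudo-label constraint into the graph term for brevity), the standard multiplicative updates from graph-regularized NMF, and the weighting rule alpha_v proportional to 1 / (2 * ||X_v - W_v H||_F).

```python
import numpy as np

def label_affinity(pseudo_labels):
    """Binary cell-cell graph: 1 where two cells share a pseudo-label (assumption)."""
    labels = np.asarray(pseudo_labels)
    A = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A

def collective_gnmf(views, pseudo_labels, k, lam=0.1, n_iter=200, seed=0, eps=1e-10):
    """Illustrative sketch only: factor each non-negative omic matrix X_v
    (features_v x cells) as W_v @ H with one shared H (k x cells)."""
    rng = np.random.default_rng(seed)
    n_cells = views[0].shape[1]
    A = label_affinity(pseudo_labels)
    D = np.diag(A.sum(axis=1))                 # degree matrix, so L = D - A
    Ws = [rng.random((X.shape[0], k)) for X in views]
    H = rng.random((k, n_cells))
    alpha = np.full(len(views), 1.0 / len(views))  # start with equal omic weights

    for _ in range(n_iter):
        # Multiplicative update of each omic-specific basis W_v.
        for v, X in enumerate(views):
            Ws[v] *= (X @ H.T) / (Ws[v] @ (H @ H.T) + eps)

        # Multiplicative update of the shared representation H:
        # weighted data terms plus the graph terms lam*H@A and lam*H@D.
        num = lam * (H @ A)
        den = lam * (H @ D) + eps
        for a, W, X in zip(alpha, Ws, views):
            num += a * (W.T @ X)
            den += a * (W.T @ W @ H)
        H *= num / den

        # Adaptive weights (assumed rule): smaller residual -> larger weight.
        err = np.array([np.linalg.norm(X - W @ H) for X, W in zip(views, Ws)])
        alpha = 1.0 / (2.0 * err + eps)
        alpha /= alpha.sum()

    return Ws, H, alpha

# Toy usage with synthetic data: two "omics" measured on the same 40 cells.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 20)                 # hypothetical pseudo-labels
views = [rng.random((30, 40)), rng.random((15, 40))]
Ws, H, alpha = collective_gnmf(views, labels, k=2)
clusters = H.argmax(axis=0)                    # one simple way to read off clusters
```

Sharing a single `H` across omics is what makes the factorization "collective"; reading clusters from the largest entry in each column of `H` is one simple convention, not necessarily the one used in the paper.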

Date: 2025

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013375 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13375&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013375

DOI: 10.1371/journal.pcbi.1013375

More articles in PLOS Computational Biology from Public Library of Science

 
Page updated 2025-08-23
Handle: RePEc:plo:pcbi00:1013375