
Multiplex networks-based directed graph neural network for cancer driver gene identification

Pingting Li and Minzhu Xie

PLOS Computational Biology, 2026, vol. 22, issue 5, 1-19

Abstract: Identifying cancer driver genes is crucial in precision oncology. Most existing methods rely on a single interaction network to capture gene relationships. However, with the increasing availability of multi-omics and biological network data, integrating multiplex networks offers a more comprehensive representation of the complex and directional regulatory interactions among genes. Moreover, the number of validated cancer driver genes remains small compared with the vast number of unlabeled genes, leading to label scarcity and class imbalance. To address these limitations, we propose a multiplex networks-based directed graph neural network (MNDGNN). The model learns gene representations on multiplex networks with multi-omics data through directed graph convolution, which integrates neighbor diversity and degree diversity. We also incorporate data augmentation combining positive-sample augmentation with negative-sample inference to mitigate label scarcity. Experimental results show that the proposed method achieves better predictive performance and robustness than existing state-of-the-art methods. The predicted cancer driver genes are significantly enriched in cancer-related pathways and exhibit extensive interactions with known cancer driver genes, offering a new perspective for cancer driver gene discovery and the design of therapeutic strategies.

Author summary: Cancer genomes often contain many mutations, but only a small fraction actively promote tumor growth. Therefore, distinguishing driver mutations from the vast background of passenger mutations is a critical task for understanding disease mechanisms and developing targeted therapies. Although large-scale sequencing has enabled the discovery of hundreds of cancer driver genes, many of these genes remain difficult to interpret because relevant evidence is scattered across different data types and biological interaction networks, and only a limited set has been experimentally validated.
In this study, we develop a computational approach that integrates multi-omics data with multiplex biological interaction networks, rather than relying on a single network. We also incorporate directionality in regulatory relationships to better reflect how signals propagate through gene networks. In addition, we employ a data augmentation strategy to facilitate effective learning under label scarcity. Our method improves predictive performance over existing approaches and prioritizes candidate cancer driver genes that are strongly connected to known cancer driver genes and enriched in cancer-relevant pathways, providing a practical shortlist for downstream experimental validation and therapeutic target discovery.
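
Illustrative sketch (not part of the published record): the abstract describes directed graph convolution over multiplex networks without giving its equations, so the following is only a generic, hypothetical example of the idea and not the authors' MNDGNN. It assumes each network layer is a dense adjacency matrix with `adj[i, j] = 1` for an edge from gene i to gene j, aggregates regulators (in-neighbors) and targets (out-neighbors) with separate weights, and averages over layers. The function names, weight matrices, and toy data are all assumptions made for illustration.

```python
import numpy as np


def row_normalize(adj):
    """Divide each row by its sum; rows without edges stay all-zero."""
    deg = adj.sum(axis=1, keepdims=True)
    return np.divide(adj, deg, out=np.zeros_like(adj, dtype=float), where=deg > 0)


def directed_multiplex_conv(x, adjs, w_self, w_in, w_out):
    """One hypothetical directed convolution step over several network layers.

    x    : (n_genes, n_features) node features, e.g. multi-omics values
    adjs : list of (n_genes, n_genes) arrays, adj[i, j] = 1 for edge i -> j
    w_*  : (n_features, n_hidden) weights for self, in- and out-neighbors
    """
    h = np.zeros((x.shape[0], w_self.shape[1]))
    for adj in adjs:
        from_targets = row_normalize(adj) @ x    # mean over genes that i points to
        from_sources = row_normalize(adj.T) @ x  # mean over genes pointing to i
        h += x @ w_self + from_sources @ w_in + from_targets @ w_out
    return np.maximum(h / len(adjs), 0.0)  # average over layers, then ReLU


# Toy example: 3 genes, 2 network layers (all values are made up).
x = np.eye(3)
layer_a = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=float)  # gene 0 -> gene 1
layer_b = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], dtype=float)  # gene 1 -> gene 2
rng = np.random.default_rng(0)
w_self, w_in, w_out = (rng.normal(size=(3, 4)) for _ in range(3))
h = directed_multiplex_conv(x, [layer_a, layer_b], w_self, w_in, w_out)
print(h.shape)
```

Keeping separate weights for incoming and outgoing edges is what makes the step direction-aware: with a single symmetric aggregation, a regulator and its target would be indistinguishable.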

Date: 2026

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1014275 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 14275&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1014275

DOI: 10.1371/journal.pcbi.1014275

More articles in PLOS Computational Biology from Public Library of Science

 
Page updated 2026-05-18
Handle: RePEc:plo:pcbi00:1014275