A self-conformation-aware pre-training framework for molecular property prediction with substructure interpretability
Jianbo Qiao,
Junru Jin,
Ding Wang,
Saisai Teng,
Junyu Zhang,
Xuetong Yang,
Yuhang Liu,
Yu Wang,
Lizhen Cui,
Quan Zou,
Ran Su and
Leyi Wei
Additional contact information
Jianbo Qiao: Shandong University
Junru Jin: Shandong University
Ding Wang: Shandong University
Saisai Teng: Shandong University
Junyu Zhang: Shandong University
Xuetong Yang: Shandong University
Yuhang Liu: Macao Polytechnic University
Yu Wang: Shandong University
Lizhen Cui: Shandong University
Quan Zou: University of Electronic Science and Technology of China
Ran Su: Tianjin University
Leyi Wei: Macao Polytechnic University
Nature Communications, 2025, vol. 16, issue 1, 1-16
Abstract:
The major challenges in drug development stem from frequent structure-activity cliffs and unknown drug properties, which are expensive and time-consuming to estimate, contributing to a high rate of failures and substantial unavoidable costs in the clinical phases. Herein, we propose the self-conformation-aware graph transformer (SCAGE), an innovative deep learning architecture pretrained with approximately 5 million drug-like compounds for molecular property prediction. Notably, we develop a multitask pretraining framework, which incorporates four supervised and unsupervised tasks: molecular fingerprint prediction, functional group prediction using chemical prior information, 2D atomic distance prediction, and 3D bond angle prediction, covering aspects from molecular structures to functions. It enables learning comprehensive conformation-aware prior knowledge, thereby enhancing its generalization across various molecular property tasks. Moreover, we design a data-driven multiscale conformational learning strategy that effectively guides the model in understanding and representing atomic relationships at the molecular conformational scale. SCAGE achieves significant performance improvements across 9 molecular properties and 30 structure-activity cliff benchmarks. Case studies demonstrate that SCAGE accurately captures crucial functional groups at the atomic level, which are closely associated with molecular activity, providing valuable insights into quantitative structure-activity relationships.
Date: 2025
Downloads: (external link)
https://www.nature.com/articles/s41467-025-59634-0 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-59634-0
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-59634-0
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.