Enhancing cancer drug discovery: QSAR modeling with machine learning and chemical representations
Raúl Acosta-Murillo, José Carlos Ortiz-Bayliss and Patricio Adrian Zapata-Morin
PLOS ONE, 2026, vol. 21, issue 3, 1-29
Abstract:
Accurately predicting the bioactivity of small molecules against cancer therapeutic targets remains a significant challenge at the intersection of cheminformatics and drug discovery. This study comprehensively evaluates chemical representations, including AtomPair Counts (APC), Avalon (AVN), Extended-Connectivity Fingerprint diameter 4 (ECFP4), Extended-Connectivity Fingerprint diameter 6 (ECFP6), Feature-based Morgan 2 (FM2), Feature-based Morgan 3 (FM3), Mol2Vec (M2V), Molecular ACCess System (MACCS), Mordred 2D Chi Kappa (MK2), RDKFingerprint (RDF), RDKit PhysChem (RDC), and Torsion (TSN), combined with machine learning algorithms (Bayesian Ridge (BRG), Elastic Net (ENT), Extra Trees (ETT), Hist Gradient Boosting (HGT), K-Nearest Neighbors (kNN), Lasso (LSS), Multi-layer Perceptron (MLP), Partial Least Squares (PLS), Random Forest (RFT), Ridge (RDG), Support Vector Regressor (SVR), and XGBoost (XGB)) for predicting cancer bioactivities. The results show that the AVN chemical representation, in conjunction with the SVR algorithm, achieved the highest predictive accuracy, with an R² of 0.735 on the FGFR1 dataset, while, among the cancer datasets evaluated, the mTOR dataset demonstrated the highest average performance across all models and chemical representations, with an R² of 0.592. These findings demonstrate how cheminformatics tools like molecular fingerprints and quantitative structure-activity relationship (QSAR) modeling can significantly enhance bioactivity prediction, ultimately contributing to more efficient and targeted cancer drug discovery.
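The abstract reports model quality as R² values. As a minimal sketch of how such a score is commonly computed, the block below uses the standard coefficient of determination, 1 - SS_res / SS_tot. This is an assumption: the abstract does not say which R² definition the authors used. The function name `r2_score` and the toy activity values are illustrative only and are not taken from the paper.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition; the abstract does not state which R2 variant is used.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = fmean(y_true)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Toy bioactivity values, made up for illustration (not data from the paper).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
score = r2_score(observed, predicted)
print(f"R2 = {score:.3f}")
```

Under this definition a score of 1 means perfect predictions and a score of 0 means the model does no better than always predicting the mean observed activity.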
Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0343654 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 43654&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0343654
DOI: 10.1371/journal.pone.0343654