Expanding drug targets for 112 chronic diseases using a machine learning-assisted genetic priority score
Robert Chen,
Áine Duffy,
Ben O. Petrazzini,
Ha My Vy,
David Stein,
Matthew Mort,
Joshua K. Park,
Avner Schlessinger,
Yuval Itan,
David N. Cooper,
Daniel M. Jordan,
Ghislain Rocheleau and
Ron Do
Additional contact information
Robert Chen: Icahn School of Medicine at Mount Sinai
Áine Duffy: Icahn School of Medicine at Mount Sinai
Ben O. Petrazzini: Icahn School of Medicine at Mount Sinai
Ha My Vy: Icahn School of Medicine at Mount Sinai
David Stein: Icahn School of Medicine at Mount Sinai
Matthew Mort: Cardiff University
Joshua K. Park: Icahn School of Medicine at Mount Sinai
Avner Schlessinger: Icahn School of Medicine at Mount Sinai
Yuval Itan: Icahn School of Medicine at Mount Sinai
David N. Cooper: Cardiff University
Daniel M. Jordan: Icahn School of Medicine at Mount Sinai
Ghislain Rocheleau: Icahn School of Medicine at Mount Sinai
Ron Do: Icahn School of Medicine at Mount Sinai
Nature Communications, 2024, vol. 15, issue 1, 1-16
Abstract:
Identifying genetic drivers of chronic diseases is necessary for drug discovery. Here, we develop a machine learning-assisted genetic priority score, which we call ML-GPS, that incorporates genetic associations with predicted disease phenotypes to enhance target discovery. First, we construct gradient boosting models to predict 112 chronic disease phecodes in the UK Biobank and analyze associations of predicted and observed phenotypes with common, rare, and ultra-rare variants to model the allelic series. We integrate these associations with existing evidence using gradient boosting with continuous feature encoding to construct ML-GPS, training it to predict drug indications in Open Targets and externally testing it in SIDER. We then generate ML-GPS predictions for 2,362,636 gene-phecode pairs. We find that the use of predicted phenotypes, which identify substantially more genetic associations than observed phenotypes across the allele frequency spectrum, significantly improves the performance of ML-GPS. ML-GPS increases coverage of drug targets, with the top 1% of all scores providing support for 15,077 gene-phecode pairs that previously had no support. ML-GPS can also identify well-known target-disease relationships, promising targets without indicated drugs, and targets for several drugs in clinical trials, including LRRK2 inhibitors for Parkinson’s disease and olpasiran for cardiovascular disease.
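The "top 1% of all scores" step in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes the cutoff is taken by rank over all scored gene-phecode pairs, and that "previously had no support" means the pair is absent from some prior evidence set. The function names, gene labels, phecode label, and scores below are hypothetical toy values.

```python
def top_percent(scores, percent=1):
    """Return the highest-scoring keys, taking the top `percent` by rank.

    `scores` maps a (gene, phecode) pair to a numeric score.
    Integer arithmetic avoids floating-point surprises in the cutoff.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    k = max(1, len(scores) * percent // 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def newly_supported(top_pairs, previously_supported):
    """Keep only the top pairs that are absent from the prior evidence set."""
    return [pair for pair in top_pairs if pair not in previously_supported]


# Toy data (hypothetical): 200 pairs with made-up scores.
toy_scores = {(f"GENE{i}", "phecode_x"): i / 200 for i in range(200)}
toy_prior = {("GENE199", "phecode_x")}

top = top_percent(toy_scores)            # the 2 best-ranked pairs
new = newly_supported(top, toy_prior)    # pairs with no prior support

# Size of a rank-based top 1% at the scale quoted in the abstract.
N_PAIRS = 2_362_636
TOP_1_PERCENT = N_PAIRS * 1 // 100
```

Under the rank-based assumption, the top 1% of 2,362,636 pairs would contain 23,626 pairs, so the 15,077 newly supported pairs reported in the abstract would be roughly 64% of that set. The article itself may define the cutoff differently.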
Date: 2024
Downloads: https://www.nature.com/articles/s41467-024-53333-y (abstract, text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-53333-y
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-024-53333-y
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie