Estimating disease prevalence in large datasets using genetic risk scores
Benjamin D. Evans,
Piotr Słowiński,
Andrew T. Hattersley,
Samuel E. Jones,
Seth Sharp,
Robert A. Kimmitt,
Michael N. Weedon,
Richard A. Oram,
Krasimira Tsaneva-Atanasova and
Nicholas J. Thomas
Additional contact information
Benjamin D. Evans: University of Exeter
Piotr Słowiński: University of Exeter
Andrew T. Hattersley: University of Exeter Medical School, Institute of Biomedical & Clinical Science
Samuel E. Jones: University of Exeter Medical School, Institute of Biomedical & Clinical Science
Seth Sharp: University of Exeter Medical School, Institute of Biomedical & Clinical Science
Robert A. Kimmitt: University of Exeter Medical School, Institute of Biomedical & Clinical Science
Michael N. Weedon: University of Exeter Medical School, Institute of Biomedical & Clinical Science
Richard A. Oram: University of Exeter Medical School, Institute of Biomedical & Clinical Science
Krasimira Tsaneva-Atanasova: University of Exeter
Nicholas J. Thomas: University of Exeter
Nature Communications, 2021, vol. 12, issue 1, 1-12
Abstract:
Clinical classification is essential for estimating disease prevalence but is difficult, often requiring complex investigations. The widespread availability of population-level genetic data makes novel genetic stratification techniques a highly attractive alternative. We propose a generalizable mathematical framework for determining disease prevalence within a cohort using genetic risk scores. We compare and evaluate methods based on the means of genetic risk scores’ distributions; the Earth Mover’s Distance between distributions; a linear combination of kernel density estimates of distributions; and an Excess method. We demonstrate the performance of genetic stratification to produce robust prevalence estimates. Specifically, we show that robust estimates of prevalence are still possible even with rarer diseases, smaller cohort sizes and less discriminative genetic risk scores, highlighting the general utility of these approaches. Genetic stratification techniques offer exciting new research tools, enabling unbiased insights into disease prevalence and clinical characteristics unhampered by clinical classification criteria.
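As an illustration of the first idea named in the abstract (a method based on the means of genetic risk score distributions), the following is a minimal sketch and not the authors' implementation. It assumes the cohort is a two-component mixture of a case-like and a control-like reference population, so that mean(mixture) = p * mean(cases) + (1 - p) * mean(controls), which can be solved for the prevalence p. The function name, the Gaussian score distributions and the sample sizes are all hypothetical choices made for the example.

```python
import random
import statistics


def estimate_prevalence_from_means(mixture, cases_ref, controls_ref):
    """Solve mean(mixture) = p*mean(cases) + (1-p)*mean(controls) for p.

    Hypothetical helper for illustration; inputs are sequences of risk scores.
    """
    mu_cases = statistics.fmean(cases_ref)
    mu_controls = statistics.fmean(controls_ref)
    if mu_cases == mu_controls:
        raise ValueError("reference means are equal; prevalence is not identifiable")
    p = (statistics.fmean(mixture) - mu_controls) / (mu_cases - mu_controls)
    # Sampling noise can push the raw estimate outside [0, 1]; clip it.
    return min(1.0, max(0.0, p))


# Synthetic data (assumption: Gaussian scores, cases shifted by one unit).
rng = random.Random(0)
true_p = 0.2
cases_ref = [rng.gauss(1.0, 1.0) for _ in range(20000)]
controls_ref = [rng.gauss(0.0, 1.0) for _ in range(20000)]
mixture = [
    rng.gauss(1.0, 1.0) if rng.random() < true_p else rng.gauss(0.0, 1.0)
    for _ in range(20000)
]

p_hat = estimate_prevalence_from_means(mixture, cases_ref, controls_ref)
print(f"true prevalence: {true_p:.3f}, estimated: {p_hat:.3f}")
```

The estimate depends only on three sample means, which makes it cheap to compute. Its precision falls as the two reference means move closer together, which is one way to read the abstract's point about less discriminative genetic risk scores.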
Date: 2021
Downloads: (external link)
https://www.nature.com/articles/s41467-021-26501-7 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-26501-7
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-021-26501-7
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.