Hypothesis-free phenotype prediction within a genetics-first framework
Chang Lu,
Jan Zaucha,
Rihab Gam,
Hai Fang,
Smithers,
Matt E. Oates,
Miguel Bernabe-Rubio,
James Williams,
Natalie Zelenka,
Arun Prasad Pandurangan,
Himani Tandon,
Hashem Shihab,
Raju Kalaivani,
Minkyung Sung,
Adam J. Sardar,
Bastian Greshake Tzovoras,
Davide Danovi and
Julian Gough
Additional contact information
Chang Lu: Cambridge Biomedical Campus
Jan Zaucha: University of Bristol
Rihab Gam: Cambridge Biomedical Campus
Hai Fang: University of Bristol
Smithers: University of Bristol
Matt E. Oates: University of Bristol
Miguel Bernabe-Rubio: King’s College London, Guy’s Hospital
James Williams: King’s College London, Guy’s Hospital
Natalie Zelenka: University of Bristol
Arun Prasad Pandurangan: Cambridge Biomedical Campus
Himani Tandon: Cambridge Biomedical Campus
Hashem Shihab: University of Bristol
Raju Kalaivani: Cambridge Biomedical Campus
Minkyung Sung: Cambridge Biomedical Campus
Adam J. Sardar: University of Bristol
Bastian Greshake Tzovoras: Université de Paris, INSERM U1284, Center for Research and Interdisciplinarity (CRI)
Davide Danovi: King’s College London, Guy’s Hospital
Julian Gough: Cambridge Biomedical Campus
Nature Communications, 2023, vol. 14, issue 1, 1-14
Abstract:
Cohort-wide sequencing studies have revealed that the largest category of variants is those deemed ‘rare’, even for the subset located in coding regions (99% of known coding variants are seen in less than 1% of the population). Associative methods give some understanding of how rare genetic variants influence disease and organism-level phenotypes. But here we show that additional discoveries can be made through a knowledge-based approach using protein domains and ontologies (function and phenotype) that considers all coding variants regardless of allele frequency. We describe an ab initio, genetics-first method making molecular knowledge-based interpretations for exome-wide non-synonymous variants for phenotypes at the organism and cellular level. By using this reverse approach, we identify plausible genetic causes for developmental disorders that have eluded other established methods and present molecular hypotheses for the causal genetics of 40 phenotypes generated from a direct-to-consumer genotype cohort. This system offers a chance to extract further discoveries from genetic data after standard tools have been applied.
Date: 2023
Downloads: https://www.nature.com/articles/s41467-023-36634-6 (abstract, text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-36634-6
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-023-36634-6
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.