A genomic mutational constraint map using variation in 76,156 human genomes
Siwei Chen,
Laurent C. Francioli,
Julia K. Goodrich,
Ryan L. Collins,
Masahiro Kanai,
Qingbo Wang,
Jessica Alföldi,
Nicholas A. Watts,
Christopher Vittal,
Laura D. Gauthier,
Timothy Poterba,
Michael W. Wilson,
Yekaterina Tarasova,
William Phu,
Riley Grant,
Mary T. Yohannes,
Zan Koenig,
Yossi Farjoun,
Eric Banks,
Stacey Donnelly,
Stacey Gabriel,
Namrata Gupta,
Steven Ferriera,
Charlotte Tolonen,
Sam Novod,
Louis Bergelson,
David Roazen,
Valentin Ruano-Rubio,
Miguel Covarrubias,
Christopher Llanwarne,
Nikelle Petrillo,
Gordon Wade,
Thibault Jeandet,
Ruchi Munshi,
Kathleen Tibbetts,
Anne O’Donnell-Luria,
Matthew Solomonson,
Cotton Seed,
Alicia R. Martin,
Michael E. Talkowski,
Heidi L. Rehm,
Mark J. Daly,
Grace Tiao,
Benjamin M. Neale,
Daniel G. MacArthur and
Konrad J. Karczewski
Additional contact information
Siwei Chen: Broad Institute of MIT and Harvard
Laurent C. Francioli: Broad Institute of MIT and Harvard
Julia K. Goodrich: Broad Institute of MIT and Harvard
Ryan L. Collins: Broad Institute of MIT and Harvard
Masahiro Kanai: Broad Institute of MIT and Harvard
Qingbo Wang: Broad Institute of MIT and Harvard
Jessica Alföldi: Broad Institute of MIT and Harvard
Nicholas A. Watts: Broad Institute of MIT and Harvard
Christopher Vittal: Broad Institute of MIT and Harvard
Laura D. Gauthier: Broad Institute of MIT and Harvard
Timothy Poterba: Broad Institute of MIT and Harvard
Michael W. Wilson: Broad Institute of MIT and Harvard
Yekaterina Tarasova: Broad Institute of MIT and Harvard
William Phu: Broad Institute of MIT and Harvard
Riley Grant: Broad Institute of MIT and Harvard
Mary T. Yohannes: Broad Institute of MIT and Harvard
Zan Koenig: Massachusetts General Hospital
Yossi Farjoun: Lady Davis Institute
Eric Banks: Broad Institute of MIT and Harvard
Stacey Donnelly: Broad Institute of MIT and Harvard
Stacey Gabriel: Broad Institute of MIT and Harvard
Namrata Gupta: Broad Institute of MIT and Harvard
Steven Ferriera: Broad Institute of MIT and Harvard
Charlotte Tolonen: Broad Institute of MIT and Harvard
Sam Novod: Broad Institute of MIT and Harvard
Louis Bergelson: Broad Institute of MIT and Harvard
David Roazen: Broad Institute of MIT and Harvard
Valentin Ruano-Rubio: Broad Institute of MIT and Harvard
Miguel Covarrubias: Broad Institute of MIT and Harvard
Christopher Llanwarne: Broad Institute of MIT and Harvard
Nikelle Petrillo: Broad Institute of MIT and Harvard
Gordon Wade: Broad Institute of MIT and Harvard
Thibault Jeandet: Broad Institute of MIT and Harvard
Ruchi Munshi: Broad Institute of MIT and Harvard
Kathleen Tibbetts: Broad Institute of MIT and Harvard
Anne O’Donnell-Luria: Broad Institute of MIT and Harvard
Matthew Solomonson: Broad Institute of MIT and Harvard
Cotton Seed: Massachusetts General Hospital
Alicia R. Martin: Broad Institute of MIT and Harvard
Michael E. Talkowski: Broad Institute of MIT and Harvard
Heidi L. Rehm: Broad Institute of MIT and Harvard
Mark J. Daly: Broad Institute of MIT and Harvard
Grace Tiao: Broad Institute of MIT and Harvard
Benjamin M. Neale: Broad Institute of MIT and Harvard
Daniel G. MacArthur: Broad Institute of MIT and Harvard
Konrad J. Karczewski: Broad Institute of MIT and Harvard
Nature, 2024, vol. 625, issue 7993, 92-100
Abstract:
The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders (refs. 1–4), but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.
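Illustrative note: the abstract describes constraint as a depletion of observed variation relative to what a mutational model predicts, but it does not state a formula. The sketch below shows one common way such a depletion can be scored, as a signed z-like statistic per genomic window. The function name depletion_z, the window labels and the counts are hypothetical, and the formula is an assumption made for illustration, not necessarily the authors' exact definition of Gnocchi.

```python
import math


def depletion_z(observed: int, expected: float) -> float:
    """Signed z-like depletion score for one genomic window.

    Positive values mean fewer variants were observed than the
    mutational model expected (a signal consistent with constraint).
    ASSUMPTION: this (expected - observed) / sqrt(expected) form is an
    illustrative choice, not a formula taken from the abstract.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (expected - observed) / math.sqrt(expected)


# Hypothetical windows: (label, observed variant count, expected count).
windows = [
    ("window_a", 60, 100.0),   # fewer variants than expected
    ("window_b", 100, 100.0),  # matches expectation
    ("window_c", 121, 100.0),  # more variants than expected
]

scores = {name: depletion_z(obs, exp) for name, obs, exp in windows}

for name, score in scores.items():
    print(f"{name}: z = {score:.2f}")
```

Under this assumed scoring, a window with a large positive score would be a candidate constrained region, while scores near zero or below give no evidence of depletion. The expected count is taken as given here; in the article it comes from the mutational model that incorporates local sequence context and regional genomic features.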
Date: 2024
Downloads:
https://www.nature.com/articles/s41586-023-06045-0 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:625:y:2024:i:7993:d:10.1038_s41586-023-06045-0
Ordering information: This journal article can be ordered from
https://www.nature.com/
DOI: 10.1038/s41586-023-06045-0
Nature is currently edited by Magdalena Skipper
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.