Dissecting heritability, environmental risk, and air pollution causal effects using > 50 million individuals in MarketScan

Daniel McGuire, Havell Markus, Lina Yang, Jingyu Xu, Austin Montgomery, Arthur Berg, Qunhua Li, Laura Carrel, Dajiang J. Liu and Bibo Jiang
Additional contact information
Daniel McGuire: Penn State College of Medicine
Havell Markus: Penn State College of Medicine
Lina Yang: Penn State College of Medicine
Jingyu Xu: Penn State College of Medicine
Austin Montgomery: Penn State College of Medicine
Arthur Berg: Penn State College of Medicine
Qunhua Li: Penn State University
Laura Carrel: Penn State College of Medicine
Dajiang J. Liu: Penn State College of Medicine
Bibo Jiang: Penn State College of Medicine

Nature Communications, 2024, vol. 15, issue 1, 1-14

Abstract: Large national-level electronic health record (EHR) datasets offer new opportunities for disentangling the role of genes and environment through deep phenotype information and approximate pedigree structures. Here we use the approximate geographical locations of patients as a proxy for spatially correlated community-level environmental risk factors. We develop a spatial mixed linear effect (SMILE) model that incorporates both genetics and environmental contribution. We extract EHR and geographical locations from 257,620 nuclear families and compile 1083 disease outcome measurements from the MarketScan dataset. We augment the EHR with publicly available environmental data, including levels of particulate matter 2.5 (PM2.5), nitrogen dioxide (NO2), climate, and sociodemographic data. We refine the estimates of genetic heritability and quantify community-level environmental contributions. We also use wind speed and direction as instrumental variables to assess the causal effects of air pollution. In total, we find PM2.5 or NO2 have statistically significant causal effects on 135 diseases, including respiratory, musculoskeletal, digestive, metabolic, and sleep disorders, where PM2.5 and NO2 tend to affect biologically distinct disease categories. These analyses showcase several robust strategies for jointly modeling genetic and environmental effects on disease risk using large EHR datasets and will benefit upcoming biobank studies in the era of precision medicine.

Date: 2024

Downloads: (external link)
https://www.nature.com/articles/s41467-024-49566-6 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-49566-6

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-024-49566-6

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-49566-6