Lossless integration of multiple electronic health records for identifying pleiotropy using summary statistics

Ruowang Li, Rui Duan, Xinyuan Zhang, Thomas Lumley, Sarah Pendergrass, Christopher Bauer, Hakon Hakonarson, David S. Carrell, Jordan W. Smoller, Wei-Qi Wei, Robert Carroll, Digna R. Velez Edwards, Georgia Wiesner, Patrick Sleiman, Josh C. Denny, Jonathan D. Mosley, Marylyn D. Ritchie, Yong Chen and Jason H. Moore
Additional contact information
Ruowang Li: University of Pennsylvania
Rui Duan: Harvard T.H. Chan School of Public Health
Xinyuan Zhang: University of Pennsylvania
Thomas Lumley: University of Auckland
Sarah Pendergrass: Biomedical and Translational Informatics Institute
Christopher Bauer: Biomedical and Translational Informatics Institute
Hakon Hakonarson: Children’s Hospital of Philadelphia
David S. Carrell: Kaiser Permanente Washington Health Research Institute
Jordan W. Smoller: Massachusetts General Hospital
Wei-Qi Wei: Vanderbilt University Medical Centre
Robert Carroll: Vanderbilt University Medical Centre
Digna R. Velez Edwards: Vanderbilt University
Georgia Wiesner: Vanderbilt University
Patrick Sleiman: Children’s Hospital of Philadelphia
Josh C. Denny: Vanderbilt University Medical Centre
Jonathan D. Mosley: Vanderbilt University Medical Centre
Marylyn D. Ritchie: Department of Genetics, Perelman School of Medicine, University of Pennsylvania
Yong Chen: University of Pennsylvania
Jason H. Moore: University of Pennsylvania

Nature Communications, 2021, vol. 12, issue 1, 1-10

Abstract: Increasingly, clinical phenotypes with matched genetic data from bio-bank linked electronic health records (EHRs) have been used for pleiotropy analyses. Thus far, pleiotropy analysis using individual-level EHR data has been limited to data from one site. However, it is desirable to integrate EHR data from multiple sites to improve the detection power and generalizability of the results. Due to privacy concerns, individual-level patient data are not easily shared across institutions. We therefore introduce Sum-Share, a method designed to efficiently integrate EHR and genetic data from multiple sites to perform pleiotropy analysis. Sum-Share requires only summary-level data and one round of communication from each site, yet it produces test statistics identical to those obtained from pooled individual-level data. Consequently, Sum-Share can achieve lossless integration of multiple datasets. Using real EHR data from eMERGE, Sum-Share is able to identify 1734 potential pleiotropic SNPs for five cardiovascular diseases.
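
Illustration (not part of the original record): the abstract's claim that summary-level data can reproduce pooled test statistics exactly can be seen in a much simpler setting than Sum-Share itself. The sketch below is not the authors' method. It assumes an ordinary least-squares model with one simulated SNP and a continuous phenotype, and all function names, sample sizes and effect sizes are hypothetical. Each site shares only its cross-product matrices in a single round, and the combined estimates match those from stacking the individual-level rows.

```python
import numpy as np


def site_summary(X, y):
    """Summary statistics one site shares; no individual-level rows leave the site."""
    return {"xtx": X.T @ X, "xty": X.T @ y, "yty": float(y @ y), "n": X.shape[0]}


def combine(summaries):
    """Recover OLS coefficients and t-statistics from summed site summaries."""
    xtx = sum(s["xtx"] for s in summaries)
    xty = sum(s["xty"] for s in summaries)
    yty = sum(s["yty"] for s in summaries)
    n = sum(s["n"] for s in summaries)
    p = xtx.shape[0]
    beta = np.linalg.solve(xtx, xty)
    rss = yty - beta @ xty  # residual sum of squares at the OLS solution
    sigma2 = rss / (n - p)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(xtx)))
    return beta, beta / se


def pooled(X, y):
    """Reference analysis on stacked individual-level data."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, beta / se


# Simulated data for three hypothetical sites (sizes and effects are made up).
rng = np.random.default_rng(0)
sites = []
for n in (200, 350, 150):
    snp = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
    X = np.column_stack([np.ones(n), snp])            # intercept + SNP
    y = 0.5 + 0.2 * snp + rng.normal(size=n)          # continuous phenotype
    sites.append((X, y))

beta_sum, t_sum = combine([site_summary(X, y) for X, y in sites])

X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
beta_pool, t_pool = pooled(X_all, y_all)

print(np.allclose(beta_sum, beta_pool), np.allclose(t_sum, t_pool))
```

The equality holds here because the cross-products are additive across sites. How Sum-Share achieves the same property for pleiotropy tests on EHR phenotypes is described in the article, not in this sketch.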

Date: 2021

Downloads: (external link)
https://www.nature.com/articles/s41467-020-20211-2 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-020-20211-2

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-020-20211-2

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-020-20211-2