Variant-specific inflation factors for assessing population stratification at the phenotypic variance level
Tamar Sofer,
Xiuwen Zheng,
Cecelia A. Laurie,
Stephanie M. Gogarten,
Jennifer A. Brody,
Matthew P. Conomos,
Joshua C. Bis,
Timothy A. Thornton,
Adam Szpiro,
Jeffrey R. O’Connell,
Ethan M. Lange,
Yan Gao,
L. Adrienne Cupples,
Bruce M. Psaty and
Kenneth M. Rice
Additional contact information
Tamar Sofer: Brigham and Women’s Hospital
Xiuwen Zheng: University of Washington
Cecelia A. Laurie: University of Washington
Stephanie M. Gogarten: University of Washington
Jennifer A. Brody: Epidemiology, and Health Services, University of Washington
Matthew P. Conomos: University of Washington
Joshua C. Bis: University of Washington
Timothy A. Thornton: University of Washington
Adam Szpiro: University of Washington
Jeffrey R. O’Connell: University of Maryland School of Medicine
Ethan M. Lange: University of Colorado Anschutz Medical Campus
Yan Gao: University of Mississippi Medical Center
L. Adrienne Cupples: Boston University School of Public Health
Bruce M. Psaty: Epidemiology, and Health Services, University of Washington
Kenneth M. Rice: University of Washington
Nature Communications, 2021, vol. 12, issue 1, 1-14
Abstract:
In modern Whole Genome Sequencing (WGS) epidemiological studies, participant-level data from multiple studies are often pooled and results are obtained from a single analysis. We consider the impact of differential phenotype variances by study, which we term ‘variance stratification’. Unaccounted for, variance stratification can lead to both decreased statistical power and increased false-positive rates, depending on how allele frequencies, sample sizes, and phenotypic variances vary across the studies that are pooled. We develop a procedure to compute variant-specific inflation factors, and show how it can be used to diagnose genetic association analyses of pooled individual-level data from multiple studies. We describe a WGS-appropriate analysis approach, implemented in freely available software, which allows study-specific variances and thereby improves performance in practice. We illustrate the variance stratification problem, its solutions, and the proposed diagnostic procedure in simulations and in data from the Trans-Omics for Precision Medicine Whole Genome Sequencing Program (TOPMed), used in association tests for hemoglobin concentrations and BMI.
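The dependence on allele frequencies, sample sizes, and phenotypic variances described in the abstract can be illustrated with a small sketch. This is a simplified large-sample approximation, not necessarily the article's exact formula. It assumes a null variant, study-specific intercepts, Hardy-Weinberg genotype variance 2p(1-p), and a pooled model that wrongly fits one common residual variance. The function name `approx_inflation` and all input values are hypothetical.

```python
def approx_inflation(n, maf, sigma2):
    """Approximate ratio of the true variance of a pooled association test
    statistic to the variance assumed by a single-residual-variance model.

    n      : per-study sample sizes
    maf    : per-study allele frequencies of the variant
    sigma2 : per-study residual phenotype variances

    Assumptions (illustrative only): null variant, study-specific intercepts,
    Hardy-Weinberg genotype variance 2p(1-p), large samples.
    """
    # Genotype variance in each study under Hardy-Weinberg equilibrium
    geno_var = [2.0 * p * (1.0 - p) for p in maf]
    # Each study's contribution to the genotype sum of squares
    weights = [nk * gk for nk, gk in zip(n, geno_var)]
    # Variance that actually drives the test statistic:
    # study variances weighted by genotype information
    true_var = sum(w * s for w, s in zip(weights, sigma2)) / sum(weights)
    # Variance the homogeneous model estimates:
    # study variances weighted by sample size only
    assumed_var = sum(nk * s for nk, s in zip(n, sigma2)) / sum(n)
    return true_var / assumed_var


# Hypothetical two-study example: the second study has the larger variance.
inflated = approx_inflation([1000, 1000], [0.05, 0.30], [1.0, 4.0])
deflated = approx_inflation([1000, 1000], [0.30, 0.05], [1.0, 4.0])
neutral = approx_inflation([1000, 1000], [0.20, 0.20], [1.0, 4.0])
```

Under these assumptions the ratio exceeds 1 (more false positives) when the variant is more common in the higher-variance study, falls below 1 (lost power) when it is more common in the lower-variance study, and equals 1 when allele frequencies match across studies.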
Date: 2021
Downloads:
https://www.nature.com/articles/s41467-021-23655-2 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-23655-2
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-021-23655-2
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.