Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies

Xiaolei Liu, Meng Huang, Bin Fan, Edward S Buckler and Zhiwu Zhang

PLOS Genetics, 2016, vol. 12, issue 2, 1-24

Abstract: False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. A modified MLM method, the Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts, a Fixed Effect Model (FEM) and a Random Effect Model (REM), and used them iteratively. The FEM contains testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid over-fitting in the FEM, the associated markers are estimated in the REM by using them to define kinship. The P values of testing markers and the associated markers are unified at each iteration. We named the new method Fixed and random model Circulating Probability Unification (FarmCPU). Analyses of both real and simulated data demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include an efficient computing time that is linear in both the number of individuals and the number of markers. A dataset with half a million individuals and half a million markers can now be analyzed within three days.

Author Summary: Genome-Wide Association Studies (GWAS) can reveal genetic-phenotypic relationships, but have limitations. To control false positives, population structure and kinship are incorporated in a fixed and random effect Mixed Linear Model (MLM). However, because of the confounding between population structure, kinship, and quantitative trait nucleotides (QTNs), MLM leads to false negatives, missing some potentially important discoveries. Here, we present a new method, Fixed and random model Circulating Probability Unification (FarmCPU). FarmCPU tests markers in a fixed effect model with associated markers as covariates, and separately optimizes those associated covariate markers in a random effect model. This process enables efficient computation, removes the confounding, prevents model over-fitting, and controls false positives simultaneously. FarmCPU controls false positives as well as MLM does, while reducing both false negatives and computing time. Researchers will not only be able to analyze big data, but will also have greater success with fewer mistakes when mapping genes of interest.
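The iteration described in the abstract can be illustrated with a small sketch. It is not the FarmCPU implementation: the fixed effect step tests each marker by ordinary least squares with the currently selected markers as covariates, while the random effect step (which the abstract says uses the associated markers to define kinship) is swapped for a plain Bonferroni threshold on the P values. All function names, the normal approximation to the t statistic, the `max_covariates` cap, and the simulated data are assumptions made for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(v, basis):
    """Remove from v its projection onto each orthonormal basis vector."""
    r = list(v)
    for q in basis:
        c = dot(q, r)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return r

def orthonormal_basis(columns):
    """Gram-Schmidt; (nearly) linearly dependent columns are dropped."""
    basis = []
    for col in columns:
        r = residualize(col, basis)
        norm = math.sqrt(dot(r, r))
        if norm > 1e-8:
            basis.append([x / norm for x in r])
    return basis

def marker_p_value(y, marker, covariates):
    """Two-sided P value for one marker in y ~ 1 + covariates + marker.

    Assumption: a normal approximation to the t statistic, used only to
    keep the sketch free of third-party dependencies.
    """
    n = len(y)
    basis = orthonormal_basis([[1.0] * n] + covariates)
    y_r = residualize(y, basis)
    m_r = residualize(marker, basis)
    mm = dot(m_r, m_r)
    if mm < 1e-8:  # marker is fully explained by the covariates
        return 1.0
    beta = dot(m_r, y_r) / mm
    df = n - len(basis) - 1
    sigma2 = max(dot(y_r, y_r) - beta * beta * mm, 1e-12) / df
    t = beta / math.sqrt(sigma2 / mm)
    return math.erfc(abs(t) / math.sqrt(2.0))

def iterate_tests(y, markers, max_iter=10, max_covariates=5):
    """Alternate marker testing and covariate reselection until stable."""
    threshold = 0.05 / len(markers)  # assumed stand-in for the REM step
    selected, p_values = [], []
    for _ in range(max_iter):
        # Fixed effect step: one marker at a time, selected markers as covariates.
        p_values = []
        for j, marker in enumerate(markers):
            covs = [markers[k] for k in selected if k != j]
            p_values.append(marker_p_value(y, marker, covs))
        # Selection step (simplified): keep the most significant markers.
        ranked = sorted(range(len(markers)), key=lambda j: p_values[j])
        new_selected = sorted(
            j for j in ranked[:max_covariates] if p_values[j] < threshold
        )
        if new_selected == selected:
            break
        selected = new_selected
    return p_values, selected

# Simulated example (hypothetical data): 200 individuals, 20 markers,
# with marker 3 as the only causal marker.
random.seed(0)
n, m = 200, 20
markers = [[float(random.randint(0, 2)) for _ in range(n)] for _ in range(m)]
y = [2.0 * markers[3][i] + random.gauss(0.0, 1.0) for i in range(n)]
p_values, selected = iterate_tests(y, markers)
```

In this toy setting the causal marker is tested against the other selected markers rather than against itself, which loosely mirrors the idea of keeping testing markers and covariate markers from confounding each other. The kinship-based optimization and the P value unification of the published method are not reproduced here.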

Date: 2016

Downloads: (external link)
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005767 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 05767&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1005767

DOI: 10.1371/journal.pgen.1005767

More articles in PLOS Genetics from Public Library of Science
Handle: RePEc:plo:pgen00:1005767