An Improved Expectation–Maximization Bayesian Algorithm for GWAS
Ganwen Zhang,
Jianini Zhao,
Jieru Wang,
Guo Lin,
Lin Li,
Fengfei Ban,
Meiting Zhu,
Yangjun Wen and
Jin Zhang
Additional contact information
Ganwen Zhang: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Jianini Zhao: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Jieru Wang: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Guo Lin: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Lin Li: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Fengfei Ban: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Meiting Zhu: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Yangjun Wen: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Jin Zhang: College of Science, Nanjing Agricultural University, Nanjing 210095, China
Mathematics, 2024, vol. 12, issue 13, 1-14
Abstract:
Genome-wide association studies (GWASs) are flexible and comprehensive tools for identifying single nucleotide polymorphisms (SNPs) associated with complex traits or diseases. Whole-genome Bayesian models are an effective way of incorporating important prior information into modeling, and Bayesian methods have been widely used in association analysis. However, Bayesian analysis is often not feasible because of the high-throughput genotype data and large sample sizes involved. In this study, we propose a new Bayesian algorithm under the mixed linear model framework: the expectation and maximization BayesB Improved algorithm (emBBI). The emBBI algorithm first corrects for polygenic and environmental noise and reduces dimensions; it then estimates marker effects using emBayesB and tests them using the LOD test. We conducted two simulation experiments and analyzed a real dataset related to flowering time in Arabidopsis to validate the new algorithm. The results show that, in the simulation studies, the emBBI algorithm is more flexible and accurate than established methods and performs well under complex genetic backgrounds. The analysis of the real Arabidopsis dataset further illustrates the advantages of the emBBI algorithm for GWAS by detecting known genes. Furthermore, 12 candidate genes are identified in the neighborhood of the significant flowering-related quantitative trait nucleotides (QTNs) in Arabidopsis. In addition, we performed enrichment analysis and tissue expression analysis of the candidate genes, which will help us better understand the genetic basis of flowering-related traits in Arabidopsis.
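The abstract names the LOD test as the final testing step but does not spell out the statistic. As a rough illustration only, the sketch below computes the textbook single-marker LOD score for a simple linear regression with Gaussian errors, LOD = LR / (2 ln 10), where LR = n ln(RSS0 / RSS1). This is not the emBBI procedure itself: the function name `lod_score`, the 0/1 genotype coding, and the single-marker model are assumptions made for this example, and the polygenic correction and emBayesB estimation described in the abstract are not included.

```python
import math

def lod_score(y, x):
    """Textbook single-marker LOD score (illustrative; not the emBBI test).

    y: phenotype values; x: numeric genotype codes (0/1 is assumed here).
    """
    n = len(y)
    mean_y = sum(y) / n
    mean_x = sum(x) / n
    syy = sum((v - mean_y) ** 2 for v in y)
    sxx = sum((g - mean_x) ** 2 for g in x)
    sxy = sum((g - mean_x) * (v - mean_y) for g, v in zip(x, y))
    if sxx == 0:
        # Monomorphic marker: it explains nothing, so report no evidence.
        return 0.0
    rss0 = syy                    # residual sum of squares, intercept-only model
    rss1 = syy - sxy ** 2 / sxx   # residual sum of squares with the marker effect
    # Likelihood-ratio statistic for Gaussian errors, converted to base-10 logs.
    # A perfect fit (rss1 == 0) is not handled in this sketch.
    lr = n * math.log(rss0 / rss1)
    return lr / (2.0 * math.log(10.0))
```

Calling `lod_score(phenotypes, genotypes)` once per marker and comparing the result with a chosen threshold would flag candidate markers. The threshold used in the paper is not stated in this record.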
Keywords: GWAS; Bayesian method; mixed linear model; candidate gene
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/13/1944/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/13/1944/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:13:p:1944-:d:1420475
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.