A Bayesian zero‐inflated Dirichlet‐multinomial regression model for multivariate compositional count data
Matthew D. Koslovsky
Biometrics, 2023, vol. 79, issue 4, 3239-3251
Abstract:
The Dirichlet‐multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data generated by high‐throughput sequencing technology in omics research, because they accommodate both the compositional structure of the data and overdispersion. A major limitation of the DM distribution is that it cannot handle the excess zeros typically found in practice, which may bias inference. To fill this gap, we propose a novel Bayesian zero‐inflated DM model for multivariate compositional count data with excess zeros. We then extend our approach to regression settings and embed sparsity‐inducing priors to perform variable selection for high‐dimensional covariate spaces. Throughout, modeling decisions are made to boost scalability without sacrificing interpretability or imposing limiting assumptions. Extensive simulations and an application to a human gut microbiome dataset are presented to compare the performance of the proposed method to existing approaches. We provide an accompanying R package with a user‐friendly vignette to apply our method to other datasets.
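The abstract does not spell out the model, so the following is only an illustrative sketch of how zero-inflated DM counts might be simulated, not the paper's implementation or its R package. It assumes one common construction: DM proportions are obtained by normalizing independent gamma draws, and zero inflation is introduced by switching individual components off before normalizing. The function name `sample_zidm` and the parameters `gamma_shape` and `zero_prob` are hypothetical names chosen for this example.

```python
import numpy as np


def sample_zidm(n_samples, total_count, gamma_shape, zero_prob, seed=0):
    """Simulate zero-inflated Dirichlet-multinomial counts (illustrative only).

    Assumed construction, not taken from the paper: each component is
    "at risk" with probability 1 - zero_prob; at-risk components get a
    Gamma(shape, 1) weight, the rest get weight 0; the normalized weights
    are the multinomial probabilities.
    """
    rng = np.random.default_rng(seed)
    gamma_shape = np.asarray(gamma_shape, dtype=float)
    n_taxa = gamma_shape.size
    counts = np.zeros((n_samples, n_taxa), dtype=np.int64)

    for i in range(n_samples):
        # Structural-zero indicators: True means the component can be observed.
        at_risk = rng.random(n_taxa) >= zero_prob
        if not at_risk.any():
            # Keep at least one component so the probabilities are defined.
            at_risk[rng.integers(n_taxa)] = True

        # Normalized gamma weights are Dirichlet-distributed over the
        # at-risk components; switched-off components stay exactly zero.
        weights = np.zeros(n_taxa)
        weights[at_risk] = rng.gamma(shape=gamma_shape[at_risk], scale=1.0)
        probs = weights / weights.sum()

        counts[i] = rng.multinomial(total_count, probs)

    return counts


# Compare the share of zeros with and without zero inflation.
plain_dm = sample_zidm(200, 1000, np.full(6, 5.0), zero_prob=0.0)
inflated = sample_zidm(200, 1000, np.full(6, 5.0), zero_prob=0.5)
print((plain_dm == 0).mean(), (inflated == 0).mean())
```

Setting `zero_prob=0.0` recovers an ordinary DM draw, so the two calls at the end show how the extra zeros arise purely from the switching step. Every row still sums to `total_count`, which reflects the compositional constraint the abstract refers to.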
Date: 2023
Downloads: (external link)
https://doi.org/10.1111/biom.13853
Persistent link: https://EconPapers.repec.org/RePEc:bla:biomet:v:79:y:2023:i:4:p:3239-3251
More articles in Biometrics from The International Biometric Society