
A Zero-Inflated Logistic Normal Multinomial Model for Extracting Microbial Compositions

Yanyan Zeng, Daolin Pang, Hongyu Zhao and Tao Wang

Journal of the American Statistical Association, 2023, vol. 118, issue 544, 2356-2369

Abstract: High-throughput sequencing data collected to study the microbiome provide information in the form of relative abundances and should be treated as compositions. Although many approaches, including scaling and rarefaction, have been proposed for converting raw count data into microbial compositions, most of these methods simply return zero values for zero counts. However, zeros can distort downstream analyses, and they can also pose problems for composition-aware methods. This problem is exacerbated with microbiome abundance data because they are sparse with excessive zeros. In addition to data sparsity, microbial composition estimation depends on other data characteristics such as high dimensionality, over-dispersion, and complex co-occurrence relationships. To address these challenges, we introduce a zero-inflated probabilistic PCA (ZIPPCA) model that accounts for the compositional nature of microbiome data, and propose an empirical Bayes approach to estimate microbial compositions. An efficient iterative algorithm, called classification variational approximation, is developed for carrying out maximum likelihood estimation. Moreover, we study the consistency and asymptotic normality of the variational approximation estimator from the perspective of profile M-estimation. Extensive simulations and an application to a dataset from the Human Microbiome Project are presented to compare the performance of the proposed method with that of existing methods. The method is implemented in R and available at https://github.com/YanyZeng/ZIPPCAlnm. Supplementary materials for this article are available online.
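
The zero problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ZIPPCA method or their R package; it only shows, under stated assumptions, that plain total-sum scaling maps zero counts to zero proportions, and that a composition-aware transform such as the centered log-ratio is then undefined. The function names and the read counts are hypothetical and chosen purely for illustration.

```python
import math


def total_sum_scaling(counts):
    """Convert raw counts to relative abundances by dividing by the sample total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def centered_log_ratio(composition):
    """Centered log-ratio transform; undefined when any component is zero."""
    if any(p <= 0 for p in composition):
        raise ValueError("clr is undefined for zero components")
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical read counts for one sample over five taxa (illustrative only).
counts = [120, 0, 35, 0, 845]

# Scaling keeps every zero count as an exact zero proportion.
composition = total_sum_scaling(counts)

# The log-ratio transform cannot be applied to the scaled sample as-is.
try:
    centered_log_ratio(composition)
    clr_failed = False
except ValueError:
    clr_failed = True
```

In this sketch the two unobserved taxa receive a proportion of exactly zero, and the log-ratio step fails on them. A model-based estimate of the composition, as the abstract proposes, is one way to avoid passing such zeros downstream.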

Date: 2023

Downloads: (external link)
http://hdl.handle.net/10.1080/01621459.2022.2044827 (text/html)
Access to full text is restricted to subscribers.


Persistent link: https://EconPapers.repec.org/RePEc:taf:jnlasa:v:118:y:2023:i:544:p:2356-2369


DOI: 10.1080/01621459.2022.2044827


Journal of the American Statistical Association is currently edited by Xuming He, Jun Liu, Joseph Ibrahim and Alyson Wilson


 
Handle: RePEc:taf:jnlasa:v:118:y:2023:i:544:p:2356-2369