
Modeling Longitudinal Microbiome Compositional Data: A Two-Part Linear Mixed Model with Shared Random Effects

Yongli Han, Courtney Baker, Emily Vogtmann, Xing Hua, Jianxin Shi and Danping Liu
Additional contact information
Yongli Han: National Cancer Institute
Courtney Baker: University of North Carolina
Emily Vogtmann: National Cancer Institute
Xing Hua: National Cancer Institute
Jianxin Shi: National Cancer Institute
Danping Liu: National Cancer Institute

Statistics in Biosciences, 2021, vol. 13, issue 2, No 4, 243-266

Abstract: Longitudinal microbiome studies have been widely used to unveil the dynamics in the complex host-microbial ecosystems. Modeling the longitudinal microbiome compositional data, which is semi-continuous in nature, is challenging in several aspects: the overabundance of zeros, the heavy skewness of non-zero values that are bounded in (0, 1), and the dependence between the binary and non-zero parts. To deal with these challenges, we first extended the work of Chen and Li [1] and proposed a two-part zero-inflated Beta regression model with shared random effects (ZIBR-SRE), which characterizes the dependence between the binary and the continuous parts. Besides, the microbiome compositional data have a unit-sum constraint, indicating the existence of negative correlations among taxa. As ZIBR-SRE models each taxon separately, it does not satisfy the sum-to-one constraint. We then proposed a two-part linear mixed model (TPLMM) with shared random effects to formulate the log-transformed standardized relative abundances rather than the original ones. Such a transformation is called “additive logistic transformation”, initially developed for cross-sectional compositional data. We extended it to analyze the longitudinal microbiome compositions and showed that the unit-sum constraint can be automatically satisfied under the TPLMM framework. Model performances of TPLMM and ZIBR-SRE were compared with existing methods in simulation studies. Under settings adopted from real data, TPLMM had the best performance and is recommended for practical use. An oral microbiome application further showed that TPLMM and ZIBR-SRE estimated a strong correlation structure in the binary and the continuous parts, suggesting that models that do not account for this dependence would lead to biased inferences.
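
The abstract's point that a log-ratio scale can satisfy the unit-sum constraint automatically may be easier to see with a small sketch. The code below is not the paper's TPLMM; it only illustrates the classical cross-sectional additive log-ratio transform and its inverse for a strictly positive composition. The function names `alr` and `additive_logistic`, the choice of the last taxon as reference, and the example values are assumptions made for illustration, and the handling of zeros (the binary part of a two-part model) is deliberately left out.

```python
import math


def alr(composition):
    """Additive log-ratio: log of each part relative to the last (reference) part.

    Assumes every part is strictly positive; zeros need separate handling.
    """
    if any(p <= 0 for p in composition):
        raise ValueError("alr requires strictly positive parts")
    ref = composition[-1]
    return [math.log(p / ref) for p in composition[:-1]]


def additive_logistic(y):
    """Inverse of alr: maps any real vector to a composition that sums to one."""
    exp_y = [math.exp(v) for v in y]
    denom = 1.0 + sum(exp_y)
    # The reference part gets 1/denom, so the parts sum to one by construction.
    return [e / denom for e in exp_y] + [1.0 / denom]


# Hypothetical relative abundances of three taxa in one sample.
composition = [0.5, 0.3, 0.2]
y = alr(composition)              # unconstrained real values
recovered = additive_logistic(y)  # back on the simplex

# Any real vector on the log-ratio scale maps to a valid composition.
arbitrary = additive_logistic([3.0, -2.0, 0.7])
```

Because the inverse map divides by a common denominator, whatever is estimated on the log-ratio scale maps back to proportions that sum to one, which is the property the abstract attributes to modeling on the transformed scale.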

Keywords: Longitudinal analysis; Microbiome compositional data; Shared random effects; Standardized relative abundance; Unit-sum constraint
Date: 2021

Downloads: (external link)
http://link.springer.com/10.1007/s12561-021-09302-w Abstract (text/html)
Access to the full text of the articles in this series is restricted.


Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:13:y:2021:i:2:d:10.1007_s12561-021-09302-w

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561

DOI: 10.1007/s12561-021-09302-w


Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin

More articles in Statistics in Biosciences from Springer, International Chinese Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:stabio:v:13:y:2021:i:2:d:10.1007_s12561-021-09302-w