Robust and Powerful Differential Composition Tests for Clustered Microbiome Data

Zheng-Zheng Tang and Guanhua Chen
Additional contact information
Zheng-Zheng Tang: University of Wisconsin-Madison, and Wisconsin Institute for Discovery
Guanhua Chen: University of Wisconsin-Madison

Statistics in Biosciences, 2021, vol. 13, issue 2, No 2, 200-216

Abstract: Thanks to advances in high-throughput sequencing technologies, the importance of the microbiome to human health and disease has been increasingly recognized. Analyzing microbiome data from sequencing experiments is challenging due to their unique features, such as compositional structure, excessive zero observations, overdispersion, and complex relations among microbial taxa. Clustered microbiome data have become prevalent in recent years from designs such as longitudinal studies, family studies, and matched case–control studies. The within-cluster dependence compounds the challenge of microbiome data analysis. Methods that properly accommodate intra-cluster correlation and the features of microbiome data are needed. We develop robust and powerful differential composition tests for clustered microbiome data. The methods do not rely on any distributional assumptions on the microbial compositions, which provides flexibility to model various correlation structures among taxa and among samples within a cluster. By leveraging the adjusted sandwich covariance estimate, the methods properly accommodate sample dependence within a cluster. The two-part version of the test can further improve power in the presence of excessive zero observations. Different types of confounding variables can be easily adjusted for in the methods. We perform extensive simulation studies under commonly adopted clustered data designs to evaluate the methods. We demonstrate that the methods properly control the type I error under all designs and are more powerful than existing methods in many scenarios. The usefulness of the proposed methods is further demonstrated with two real datasets from longitudinal microbiome studies on pregnant women and inflammatory bowel disease patients. The methods have been incorporated into the R package “miLineage”, publicly available at https://tangzheng1.github.io/tanglab/software.html.

Keywords: Microbiome composition; Clustered data; Association tests; Zero-inflation; Distribution-free
Date: 2021

Downloads: (external link)
http://link.springer.com/10.1007/s12561-019-09251-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:13:y:2021:i:2:d:10.1007_s12561-019-09251-5

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561

DOI: 10.1007/s12561-019-09251-5

Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin

More articles in Statistics in Biosciences from Springer, International Chinese Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

Page updated 2025-03-20
Handle: RePEc:spr:stabio:v:13:y:2021:i:2:d:10.1007_s12561-019-09251-5