A Comprehensive Performance Comparison Study of Various Statistical Models that Accommodate Challenges of the Gut Microbiome Data
Morteza Hajihosseini,
Payam Amini,
Alireza Saidi-Mehrabad,
Nastaran Hajizadeh,
Anita L. Kozyrskyj and
Irina Dinu
Additional contact information
Morteza Hajihosseini: University of Alberta
Payam Amini: Keele University
Alireza Saidi-Mehrabad: Division of Hydrological Sciences
Nastaran Hajizadeh: University of Alberta
Anita L. Kozyrskyj: University of Alberta
Irina Dinu: University of Alberta
Statistics in Biosciences, 2025, vol. 17, issue 1, No 11, 216-231
Abstract:
The human gut microbiome comprises trillions of symbiotic bacteria that colonize the human gut after birth and play an essential role in maintaining human health. Various factors can influence the human microbiome, delaying the normal maturation of the gut microbiota and leading to the onset of various diseases. Studying gut microbiome composition therefore offers evidence for early disease detection and opportunities for intervention. Stool samples analyzed by high-throughput sequencing of 16S ribosomal RNA usually yield a count table (number of reads) of detected species per sample in the form of amplicon sequence variants (ASVs). ASV count data pose several inherent challenges, such as over-dispersion, within-sample correlation, and a large number of zeros. Appropriate statistical methods are needed to measure the effect of important factors on the gut microbial community while addressing these challenges. This paper compared the behavior of the most common statistical methods that accommodate the challenges of gut microbiome data in a comprehensive simulation study. In sixty-seven percent of our simulation scenarios, the zero-inflated negative binomial model had a lower mean squared error than the other methods, and the zero-inflated Gaussian mixture model had better statistical power. The real-data application to the SKOT Cohorts dataset showed an effect of maternal obesity on infant taxon abundance at the 9- and 18-month assessments. Our study showed that some of the more recent methods could adequately accommodate the challenges of gut microbiome data without requiring data transformation or normalization.
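To make the data challenges named in the abstract concrete, the following is a minimal sketch of a zero-inflated negative binomial (ZINB) distribution and two of the properties mentioned above: over-dispersion (variance well above the mean) and excess zeros. The parameter values (`mu`, `size`, `pi_zero`) and all function names are illustrative assumptions, not values or code from the paper, and the sketch does not reproduce the authors' simulation design.

```python
import math


def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean mu and dispersion size.

    Under this parameterization the variance is mu + mu**2 / size.
    """
    p = size / (size + mu)
    log_pmf = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def zinb_pmf(k, mu, size, pi_zero):
    """ZINB pmf: a structural zero with probability pi_zero, else an NB count."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base


def zinb_summary(mu, size, pi_zero, k_max=2000):
    """Numerically summarize the ZINB distribution on counts 0..k_max."""
    probs = [zinb_pmf(k, mu, size, pi_zero) for k in range(k_max + 1)]
    total = sum(probs)
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return {"total": total, "mean": mean, "var": var, "p_zero": probs[0]}


# Hypothetical parameters chosen only for illustration.
summary = zinb_summary(mu=10.0, size=0.5, pi_zero=0.3)

# Zero probability of a Poisson distribution with the same mean, for contrast.
poisson_p_zero = math.exp(-summary["mean"])

print(summary)
print("Poisson P(0) at the same mean:", poisson_p_zero)
```

With these assumed parameters the closed-form mean is (1 - pi_zero) * mu = 7 and the closed-form variance is (1 - pi_zero) * mu * (1 + mu * (pi_zero + 1 / size)) = 168, so the variance is far larger than the mean, and the probability of a zero count (about 0.45) is far above what a Poisson model with the same mean would predict.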
Keywords: Gut microbiome; Zero-inflation; Over-dispersion; Correlation; The simulation study
Date: 2025
Downloads: http://link.springer.com/10.1007/s12561-024-09435-8 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:17:y:2025:i:1:d:10.1007_s12561-024-09435-8
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561
DOI: 10.1007/s12561-024-09435-8
Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin
More articles in Statistics in Biosciences from Springer, International Chinese Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.