DeepBiome: A Phylogenetic Tree Informed Deep Neural Network for Microbiome Data Analysis

Jing Zhai, Youngwon Choi, Xingyi Yang, Yin Chen, Kenneth Knox, Homer L. Twigg, Joong-Ho Won, Hua Zhou and Jin J. Zhou
Additional contact information
Jing Zhai: University of Arizona
Youngwon Choi: Seoul National University
Xingyi Yang: University of Arizona
Yin Chen: University of Arizona
Kenneth Knox: University of Arizona
Homer L. Twigg: Indiana University Medical Center
Joong-Ho Won: Seoul National University
Hua Zhou: University of California
Jin J. Zhou: University of California

Statistics in Biosciences, 2025, vol. 17, issue 1, No 10, 215 pages

Abstract: Evidence linking the microbiome to human health is rapidly growing. The microbiome profile has potential as a novel predictive biomarker for many diseases. However, tables of bacterial counts are typically sparse, and bacteria are classified within a hierarchy of taxonomic levels, ranging from species to phylum. Existing tools focus on identifying microbiome associations at either the community level or a specific, pre-defined taxonomic level. Incorporating the evolutionary relationship between bacteria can enhance data interpretation. This approach allows for aggregating microbiome contributions, leading to more accurate and interpretable results. We present DeepBiome, a phylogeny-informed neural network architecture, to predict phenotypes from microbiome counts and uncover the microbiome–phenotype association network. It utilizes microbiome abundance as input and employs phylogenetic taxonomy to guide the neural network’s architecture. Leveraging phylogenetic information, DeepBiome reduces the need for extensive tuning of the deep learning architecture, minimizes overfitting, and, crucially, enables the visualization of the path from microbiome counts to disease. It is applicable to both regression and classification problems. Simulation studies and real-life data analysis have shown that DeepBiome is both highly accurate and efficient. It offers deep insights into complex microbiome–phenotype associations, even with small to moderate training sample sizes. In practice, the specific taxonomic level at which microbiome clusters tag the association remains unknown. Therefore, the main advantage of the presented method over other analytical methods is that it offers an ecological and evolutionary understanding of host–microbe interactions, which is important for microbiome-based medicine. DeepBiome is implemented using Python packages Keras and TensorFlow. It is an open-source tool available at https://github.com/Young-won/DeepBiome
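
Illustration (not from the paper): the abstract says the phylogenetic taxonomy guides the network's architecture but does not say how. One possible reading, sketched below in plain Python, is a layer per taxonomic level in which a connection is kept only where the tree has an edge (a hard 0/1 mask). The toy taxonomy, the names `build_mask` and `masked_layer`, the all-ones weights and the tanh activation are all assumptions made for this example; the actual DeepBiome code may use a softer, regularization-based scheme instead.

```python
import math

# Hypothetical toy taxonomy (assumption, not data from the paper):
# each species maps to a genus, each genus maps to a phylum.
SPECIES_TO_GENUS = {"sp_a": "g1", "sp_b": "g1", "sp_c": "g2", "sp_d": "g3"}
GENUS_TO_PHYLUM = {"g1": "p1", "g2": "p1", "g3": "p2"}


def build_mask(child_to_parent, children, parents):
    """Return mask[i][j] = 1.0 if children[i] belongs to parents[j], else 0.0."""
    return [
        [1.0 if child_to_parent[c] == p else 0.0 for p in parents]
        for c in children
    ]


def masked_layer(x, weights, mask, bias):
    """Dense layer with weights zeroed where the tree has no edge, then tanh."""
    out = []
    for j in range(len(bias)):
        z = bias[j] + sum(
            x[i] * weights[i][j] * mask[i][j] for i in range(len(x))
        )
        out.append(math.tanh(z))
    return out


species = sorted(SPECIES_TO_GENUS)
genera = sorted(GENUS_TO_PHYLUM)
phyla = sorted(set(GENUS_TO_PHYLUM.values()))

mask_sg = build_mask(SPECIES_TO_GENUS, species, genera)  # species -> genus
mask_gp = build_mask(GENUS_TO_PHYLUM, genera, phyla)     # genus -> phylum

# Illustrative weights (all ones); a trained model would learn these.
w_sg = [[1.0] * len(genera) for _ in species]
w_gp = [[1.0] * len(phyla) for _ in genera]

# Sparse relative abundances for sp_a, sp_b, sp_c, sp_d (made-up values).
abundance = [0.4, 0.1, 0.0, 0.5]

genus_act = masked_layer(abundance, w_sg, mask_sg, [0.0] * len(genera))
phylum_act = masked_layer(genus_act, w_gp, mask_gp, [0.0] * len(phyla))
print(genus_act, phylum_act)
```

Because each hidden unit corresponds to a named taxon, the activations at the genus and phylum layers can be read off per taxon, which is the kind of path from counts to phenotype that the abstract describes visualizing.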

Keywords: Metagenomics; Phylogenetic tree; Neural networks; Prediction; Mixed taxonomic levels
Date: 2025
Downloads: (external link)
http://link.springer.com/10.1007/s12561-024-09434-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:17:y:2025:i:1:d:10.1007_s12561-024-09434-9

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561

DOI: 10.1007/s12561-024-09434-9

Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin

Page updated 2025-05-18
Handle: RePEc:spr:stabio:v:17:y:2025:i:1:d:10.1007_s12561-024-09434-9