
A Bayesian method for identifying associations between response variables and bacterial community composition

Adrian Verster, Nicholas Petronella, Judy Green, Fernando Matias and Stephen P J Brooks

PLOS Computational Biology, 2022, vol. 18, issue 7, 1-19

Abstract: Determining associations between intestinal bacteria and continuously measured physiological outcomes is important for understanding the bacteria-host relationship but is not straightforward since abundance data (compositional data) are not normally distributed. To address this issue, we developed a fully Bayesian linear regression model (BRACoD; Bayesian Regression Analysis of Compositional Data) with physiological measurements (continuous data) as a function of a matrix of relative bacterial abundances. Bacteria can be classified as operational taxonomic units or by taxonomy (genus, family, etc.). Bacteria associated with the physiological measurement were identified using a Bayesian variable selection method: Stochastic Search Variable Selection. The output is a list of inclusion probabilities (p̂) and coefficients that indicate the strength of the association (β̂_included) for each bacterial taxon. Tests with simulated communities showed that adopting a cut point value of p̂ ≥ 0.3 for identifying included bacteria optimized the true positive rate (TPR) while maintaining a false positive rate (FPR) of ≤ 5%. At this point, the chances of identifying non-contributing bacteria were low and all well-established contributors were included. Comparison with other methods showed that BRACoD (at p̂ ≥ 0.3) had higher precision and a higher TPR than a commonly used centered log-ratio transformed LASSO procedure (clr-LASSO), as well as a higher TPR than an off-the-shelf Spike and Slab method after centered log-ratio transformation (clr-SS). BRACoD was also less likely to include non-contributing bacteria that merely correlate with contributing bacteria. Analysis of a rat microbiome experiment identified 47 operational taxonomic units that contributed to fecal butyrate levels. Of these, 31 were positively and 16 negatively associated with butyrate. Consistent with their known role in butyrate metabolism, most of these fell within the Lachnospiraceae and Ruminococcaceae.
We conclude that BRACoD provides a more precise and accurate method for determining bacteria associated with a continuous physiological outcome compared to clr-LASSO. It is more sensitive than a generalized clr-SS algorithm, although it has a higher FPR. Its ability to distinguish genuine contributors from correlated bacteria makes it better suited to discriminating bacteria that directly contribute to an outcome. The algorithm corrects for the distortions arising from compositional data, making it appropriate for analysis of microbiome data.

Author summary: We present a fully Bayesian linear regression model to identify associations between physiological measurements (continuous data) and intestinal bacteria (relative bacterial abundances; compositional data). The BRACoD (Bayesian Regression Analysis of Compositional Data) algorithm corrects for the compositional nature of the bacterial data to provide a list of inclusion probabilities and regression coefficients that indicate the strength of the association. If desired, the user can specify a cut point to select only the bacteria that meet predetermined performance characteristics. Analysis of a simulated dataset based on data from a rat microbiome study indicated that an inclusion probability cut point value of ≥ 0.3 minimized the false positive rate while maintaining a reasonably high sensitivity (true positive rate). The identified associations form a starting point for generating hypotheses about the relationship between the gut microbial community and physiological outcomes.
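
As a rough illustration of how the reported outputs might be used, the sketch below applies the 0.3 cut point to a vector of inclusion probabilities and scores the selection against a known set of contributors, as one would in a simulation. It is not the BRACoD implementation: the function names, the pseudocount, and all numeric values are hypothetical, and the centered log-ratio helper only mirrors the preprocessing named for the clr-LASSO and clr-SS comparators.

```python
import numpy as np


def clr_transform(rel_abund, pseudocount=1e-6):
    """Centered log-ratio transform of a samples x taxa abundance matrix.

    The pseudocount (an assumed value) avoids log(0) for absent taxa.
    """
    x = np.asarray(rel_abund, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each sample's mean log abundance so every row sums to zero.
    return log_x - log_x.mean(axis=1, keepdims=True)


def select_taxa(inclusion_probs, cut_point=0.3):
    """Return indices of taxa whose inclusion probability meets the cut point."""
    return np.flatnonzero(np.asarray(inclusion_probs, dtype=float) >= cut_point)


def tpr_fpr(selected, true_contributors, n_taxa):
    """True and false positive rates of a selection against known contributors."""
    chosen = {int(i) for i in selected}
    truth = {int(i) for i in true_contributors}
    n_negative = n_taxa - len(truth)
    tpr = len(chosen & truth) / len(truth) if truth else float("nan")
    fpr = len(chosen - truth) / n_negative if n_negative else float("nan")
    return tpr, fpr


# Hypothetical inclusion probabilities for six taxa (illustrative values only).
p_hat = np.array([0.95, 0.10, 0.35, 0.02, 0.29, 0.80])
selected = select_taxa(p_hat, cut_point=0.3)

# Hypothetical ground truth, as would be known in a simulated community.
true_contributors = [0, 2, 4]
tpr, fpr = tpr_fpr(selected, true_contributors, n_taxa=p_hat.size)

# Hypothetical relative abundances for two samples and three taxa.
clr = clr_transform([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]])
```

Sweeping `cut_point` over a grid and recomputing `tpr_fpr` on simulated data is one way to look for a threshold that keeps the FPR under a chosen limit, which is the kind of trade-off the abstract describes.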

Date: 2022

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010108 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 10108&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1010108

DOI: 10.1371/journal.pcbi.1010108

More articles in PLOS Computational Biology from Public Library of Science

 
Page updated 2025-05-03
Handle: RePEc:plo:pcbi00:1010108