Comparing subsampling strategies for metagenomic analysis in microbial studies using amplicon sequence variants versus operational taxonomic units
Daniel Segura, Divya Sharma and Osvaldo Espin-Garcia
PLOS ONE, 2024, vol. 19, issue 12, 1-19
Abstract:
The microbiome is increasingly regarded as a key component of human health, and analysis of microbiome data can aid in the development of precision medicine. Due to the high cost of shotgun metagenomic sequencing (SM-seq), microbiome analyses can be done cost-effectively in two phases: Phase 1, sequencing of 16S ribosomal RNA, and Phase 2, SM-seq of an informative subsample. Existing research suggests strategies to select the subsample based on biological diversity and dissimilarity metrics calculated using operational taxonomic units (OTUs). However, the microbiome field has progressed towards amplicon sequence variants (ASVs), as they provide more precise microbe identification and sample diversity information. The aim of this work is to compare the subsampling strategies for two-phase metagenomic studies when using ASVs instead of OTUs, and to propose data-driven strategies for subsample selection through dimension reduction techniques. We used 199 samples of infant-gut microbiome data from the DIABIMMUNE project to generate ASVs and OTUs, then generated subsamples based on five existing biologically driven subsampling methods and two data-driven methods. Linear discriminant analysis Effect Size (LEfSe) was used to assess differential representation of taxa between the subsamples and the overall sample. Subsample selection using ASVs agreed 50-93% with selection using OTUs across the subsampling methods evaluated, and bacterial representation was similar across all methods. Although sampling using ASVs and OTUs typically led to similar results for each subsample, ASVs yielded more clades that differed in expression levels between allergic and non-allergic individuals across all sample sizes compared to OTUs, and led to more biomarkers discovered at the Phase 2 (SM-seq) level.
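The idea of selecting a Phase 2 subsample from Phase 1 data, and then measuring how far an ASV-based selection agrees with an OTU-based one, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the five biologically driven methods, so ranking samples by Shannon diversity is assumed here as one plausible diversity-based strategy. The function names (`shannon`, `top_k_by_diversity`, `agreement`) and the toy count tables are hypothetical and invented purely for illustration.

```python
import math


def shannon(counts):
    """Shannon diversity index (natural log) of one sample's feature counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def top_k_by_diversity(table, k):
    """Pick the k most diverse samples; ties are broken by sample ID.

    table maps a sample ID to its list of feature counts (ASVs or OTUs).
    Ranking by Shannon diversity is an assumed strategy, not the paper's.
    """
    ranked = sorted(table, key=lambda s: (-shannon(table[s]), s))
    return set(ranked[:k])


def agreement(selected_a, selected_b):
    """Fraction of samples shared by two subsamples of equal size."""
    return len(selected_a & selected_b) / len(selected_a)


# Toy Phase 1 data (invented): four samples, four ASVs each.
asv_table = {
    "S1": [10, 10, 10, 10],
    "S2": [40, 0, 0, 0],
    "S3": [20, 20, 0, 0],
    "S4": [25, 5, 5, 5],
}

# Toy OTU table: ASVs 1+2 and ASVs 3+4 are assumed to collapse into two OTUs.
otu_table = {s: [c[0] + c[1], c[2] + c[3]] for s, c in asv_table.items()}

asv_pick = top_k_by_diversity(asv_table, k=3)
otu_pick = top_k_by_diversity(otu_table, k=3)
print(sorted(asv_pick), sorted(otu_pick), agreement(asv_pick, otu_pick))
```

In this toy setting, sample S3 looks diverse at ASV resolution but not after its two ASVs collapse into a single OTU, so the two selections differ in one sample. This is the kind of disagreement that an agreement percentage between ASV-based and OTU-based selection summarizes.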
Date: 2024
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315720 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 15720&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0315720
DOI: 10.1371/journal.pone.0315720