Finding the Centre: Compositional Asymmetry in High-Throughput Sequencing Datasets
Jia R. Wu,
Jean M. Macklaim,
Briana L. Genge and
Gregory B. Gloor
Additional contact information
Jia R. Wu: University of Waterloo, Department of Computer Science
Jean M. Macklaim: Department of Biochemistry
Briana L. Genge: Department of Biochemistry
Gregory B. Gloor: Department of Biochemistry
A chapter in Advances in Compositional Data Analysis, 2021, pp 329-346 from Springer
Abstract:
High-throughput sequencing datasets comprise millions of reads of genomic data and can be modelled as count compositions. These data are used to measure transcription profiles, microbial diversity, or relative cellular abundance in culture. The data are sparse and high dimensional. Moreover, they are often unbalanced, i.e. there is often systematic variation between groups due to the presence or absence of features, and this variation is important to the biological interpretation of the data. The imbalance causes samples in the comparison groups to exhibit varying centres, which contributes to false positive and false negative identifications. Here, we extend the centred log-ratio transformation method used for the comparison of differential relative abundance between two groups in a Bayesian compositional context. We demonstrate the pathology in modelled and real unbalanced experimental designs to show how this causes both false negative and false positive inference. We examined four approaches to identifying denominator features and tested them with different proportions of modelled asymmetry; two were relatively robust and are recommended. We recommend the ‘LVHA’ transformation for asymmetric transcriptome datasets, and the ‘IQLR’ method for all other datasets when using the ALDEx2 tool available on Bioconductor.
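The idea of moving the centre by choosing denominator features can be illustrated with a minimal Python sketch. It is not the ALDEx2 implementation: ALDEx2 is an R package that works in a Bayesian setting, which this sketch does not attempt to reproduce. The function names `clr` and `iqr_variance_features`, the fixed pseudocount of 0.5, and the toy counts are all assumptions made for illustration. The second function follows the general notion of an interquartile-variance denominator set, which is our reading of ‘IQLR’ and may differ in detail from the chapter.

```python
import math
from statistics import mean, pvariance, quantiles


def clr(sample, denom_idx=None, pseudo=0.5):
    """Log-ratio of every feature to the geometric mean of the denominator features.

    With denom_idx=None all features form the denominator (ordinary CLR).
    The pseudocount is an illustrative way to avoid log(0) for zero counts.
    """
    logs = [math.log(x + pseudo) for x in sample]
    idx = range(len(logs)) if denom_idx is None else denom_idx
    centre = mean(logs[i] for i in idx)  # log of the geometric mean
    return [v - centre for v in logs]


def iqr_variance_features(samples, pseudo=0.5):
    """Indices of features whose across-sample variance of ordinary CLR values
    lies between the first and third quartiles of all feature variances."""
    clr_rows = [clr(s, pseudo=pseudo) for s in samples]
    n_feat = len(samples[0])
    variances = [pvariance([row[j] for row in clr_rows]) for j in range(n_feat)]
    q1, _, q3 = quantiles(variances, n=4, method="inclusive")
    return [j for j, v in enumerate(variances) if q1 <= v <= q3]


# Toy counts (hypothetical): the last feature is abundant in samples 1 and 3 only,
# which drags the all-feature centre around between samples.
samples = [
    [10, 20, 30, 1000],
    [12, 18, 33, 5],
    [9, 22, 28, 900],
    [11, 19, 31, 4],
]

denom = iqr_variance_features(samples)
for s in samples:
    print([round(v, 2) for v in clr(s)], [round(v, 2) for v in clr(s, denom)])
```

In the printed output, the left-hand values use all features as the denominator and the right-hand values use only the selected subset; comparing the first three features across samples shows how much of their apparent change comes from the shifting centre.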
Date: 2021
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-030-71175-7_17
Ordering information: This item can be ordered from
http://www.springer.com/9783030711757
DOI: 10.1007/978-3-030-71175-7_17