Inference of trajectory presence by tree dimension and subset specificity by subtree cover
Lovemore Tenha and Mingzhou Song
PLOS Computational Biology, 2022, vol. 18, issue 2, 1-20
Abstract:
The complexity of biological processes such as cell differentiation is reflected in dynamic transitions between cellular states. Trajectory inference arranges the states into a progression using methodologies propelled by single-cell biology. However, current methods, all returning a best trajectory, do not adequately assess the statistical significance of noisy patterns, leading to uncertainty in inferred trajectories. We introduce a tree dimension test for trajectory presence in multivariate data by a dimension measure of the Euclidean minimum spanning tree, a test statistic, and a null distribution. Computable in time linear in the tree size, the tree dimension measure summarizes the extent of branching more effectively than the number of leaves, which is globally insensitive, or the tree diameter, which is indifferent to secondary branches. The test statistic quantifies trajectory presence and its null distribution is estimated under the null hypothesis of no trajectory in data. On simulated and real single-cell datasets, the test outperformed the intuitive number of leaves and tree diameter statistics. Next, we developed a measure for the tissue specificity of the dynamics of a subset, based on the minimum subtree cover of the subset in a minimum spanning tree. We found that tissue specificity of pathway gene expression dynamics is conserved in human and mouse development: several signal transduction pathways including calcium and Wnt signaling are most tissue specific, while genetic information processing pathways such as ribosome and mismatch repair are least so. Neither the tree dimension test nor the subset specificity measure has any user parameter to tune. Our work opens a window to prioritize cellular dynamics and pathways in development and other multivariate dynamical systems.
Author summary:
Modern biology now routinely studies transcriptome profiles during development. This practice demands computational methods to quantify dynamical changes in cellular states and their heterogeneity. Many methods process single-cell transcriptome data to reconstruct cellular trajectories, which are orderings of cells as they progress from an early to a late developmental stage. Due to noise in transcriptome data, there is a great need to quantify how likely it is that observed data present a trajectory-like pattern by chance. To address this need, we developed a tree dimension test to quantify evidence for trajectory presence in multivariate data based on graph-theoretical concepts. By this test, one may reject trajectory presence due to low data quality, or accept a trajectory with high statistical significance. One can now rank biological pathways by their trajectory quality. We also introduce a subset specificity measure to quantify how tissue specific cellular or pathway dynamics are. We found that pathway tissue specificity is highly conserved between human and mouse. Trajectory presence testing and subset specificity offer a unique informatics tool set to study developmental biology.
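The graph-theoretical objects named in the abstract can be illustrated with a minimal Python sketch. It builds a Euclidean minimum spanning tree, computes the two baseline statistics the abstract compares against (number of leaves and tree diameter), and finds the minimum subtree covering a subset of vertices. It does not reproduce the tree dimension measure, the test statistic, the null distribution, or the subset specificity formula, none of which are defined in this record. All function names are hypothetical, the O(n^2) Prim's algorithm is chosen only for brevity, and at least two input points are assumed.

```python
import math
from collections import defaultdict, deque

def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph; returns edges (u, v, w)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection of each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def adjacency(edges):
    """Weighted adjacency map of an undirected tree."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj

def num_leaves(adj):
    """Baseline statistic: number of degree-one vertices."""
    return sum(1 for nbrs in adj.values() if len(nbrs) == 1)

def _farthest(adj, src):
    """Vertex farthest from src along tree paths, with its distance."""
    dist = {src: 0.0}
    stack = [src]
    while stack:
        u = stack.pop()
        for v, w in adj[u].items():
            if v not in dist:
                dist[v] = dist[u] + w
                stack.append(v)
    far = max(dist, key=dist.get)
    return far, dist[far]

def tree_diameter(adj):
    """Baseline statistic: length of the longest path, via two traversals."""
    a, _ = _farthest(adj, next(iter(adj)))
    _, d = _farthest(adj, a)
    return d

def min_subtree_cover(adj, subset):
    """Vertices of the smallest subtree containing every vertex in subset,
    found by repeatedly pruning leaves that are not in the subset."""
    subset = set(subset)
    keep = {u: set(nbrs) for u, nbrs in adj.items()}
    queue = deque(u for u, nb in keep.items() if len(nb) <= 1 and u not in subset)
    while queue:
        u = queue.popleft()
        if u not in keep:
            continue
        for v in keep.pop(u):
            keep[v].discard(u)
            if len(keep[v]) <= 1 and v not in subset:
                queue.append(v)
    return set(keep)

# Toy Y-shaped point set: a centre with three arms of two points each.
points = [(0, 0), (1, 0), (2, 0), (-1, 1), (-2, 2), (-1, -1), (-2, -2)]
tree = adjacency(euclidean_mst(points))
print(num_leaves(tree))                 # 3, one leaf per arm
print(tree_diameter(tree))
print(min_subtree_cover(tree, {2, 4}))
```

On this toy input the leaf count registers the branching but the diameter only reflects the two longest arms, which is the limitation of the baseline statistics that the abstract points to.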
Date: 2022
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009829 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 09829&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1009829
DOI: 10.1371/journal.pcbi.1009829
More articles in PLOS Computational Biology from Public Library of Science