ARTreeFormer: A faster attention-based autoregressive model for phylogenetic inference
Tianyu Xie, Yicong Mao and Cheng Zhang
PLOS Computational Biology, 2025, vol. 21, issue 12, 1-22
Abstract:
Probabilistic modeling over the combinatorially large space of tree topologies remains a central challenge in phylogenetic inference. Previous approaches often necessitate pre-sampled tree topologies, limiting their modeling capability to a subset of the entire tree space. A recent advancement is ARTree, a deep autoregressive model that offers unrestricted distributions for tree topologies. However, its reliance on repetitive tree traversals and inefficient local message passing for computing topological node representations may hamper the scalability to large datasets. This paper proposes ARTreeFormer, a novel approach that harnesses fixed-point iteration and attention mechanisms to accelerate ARTree. By introducing a fixed-point iteration algorithm for computing the topological node embeddings, ARTreeFormer allows fast vectorized computation, especially on CUDA devices. This, together with an attention-based global message passing scheme, significantly improves the computation speed of ARTree while maintaining great approximation performance. We demonstrate the effectiveness and efficiency of our method on a benchmark of challenging real data phylogenetic inference problems.
Author summary:
Our research introduces novel methods for probabilistic modeling over phylogenetic tree topologies that are useful for various phylogenetic inference tasks such as tree probability estimation and variational Bayesian phylogenetic inference. Our model is based on ARTree, but achieves better scalability by leveraging a fixed-point algorithm for solving linear systems and employing an expressive attention-based architecture that captures long-range dependencies between edges. On benchmark phylogenetic inference datasets, including one with one hundred taxa, we demonstrate that the new model significantly reduces the computational cost of ARTree, while achieving similar or better performance in density estimation and variational approximation.
This work represents an important step toward scaling up variational inference for Bayesian phylogenetics. The techniques introduced here may also inspire future advances in scalable phylogenetic modeling.
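The abstract's idea of replacing tree traversals with a fixed-point iteration for a linear system can be illustrated with a small sketch. It is not the authors' implementation. It assumes, purely for illustration, that each leaf carries a fixed one-hot vector and that each interior node's embedding equals the mean of its neighbors' embeddings. The function name `fixed_point_embeddings`, its parameters, and the 4-taxon tree are all hypothetical.

```python
def fixed_point_embeddings(edges, n_leaves, n_nodes, tol=1e-10, max_iter=1000):
    """Jacobi-style fixed-point iteration for interior-node embeddings.

    Assumption (illustrative only): leaves 0..n_leaves-1 hold fixed one-hot
    vectors, and every interior node's embedding is the mean of its
    neighbors' embeddings.
    """
    # Build adjacency lists from the undirected edge list.
    neighbors = [[] for _ in range(n_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # One-hot rows for leaves, zero rows as the initial guess for interior nodes.
    emb = [
        [1.0 if i == j else 0.0 for j in range(n_leaves)] if i < n_leaves
        else [0.0] * n_leaves
        for i in range(n_nodes)
    ]

    for _ in range(max_iter):
        new = [row[:] for row in emb]
        # Each interior node is updated from the previous iterate only,
        # so the updates are independent of one another.
        for i in range(n_leaves, n_nodes):
            deg = len(neighbors[i])
            new[i] = [
                sum(emb[k][j] for k in neighbors[i]) / deg
                for j in range(n_leaves)
            ]
        # Largest entrywise change between consecutive iterates.
        delta = max(
            abs(a - b)
            for r_new, r_old in zip(new, emb)
            for a, b in zip(r_new, r_old)
        )
        emb = new
        if delta < tol:
            break
    return emb


# Hypothetical 4-taxon unrooted tree: leaves 0-3, interior nodes 4 and 5.
edges = [(0, 4), (1, 4), (4, 5), (2, 5), (3, 5)]
emb = fixed_point_embeddings(edges, n_leaves=4, n_nodes=6)
```

Because every update reads only the previous iterate, the inner loop over interior nodes could be replaced by a single batched matrix product, which is the kind of computation that vectorizes well on a GPU. This sketch uses plain Python lists only to stay self-contained.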
Date: 2025
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013768 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13768&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013768
DOI: 10.1371/journal.pcbi.1013768
More articles in PLOS Computational Biology from Public Library of Science