ARTreeFormer: A faster attention-based autoregressive model for phylogenetic inference

Tianyu Xie, Yicong Mao and Cheng Zhang

PLOS Computational Biology, 2025, vol. 21, issue 12, 1-22

Abstract: Probabilistic modeling over the combinatorially large space of tree topologies remains a central challenge in phylogenetic inference. Previous approaches often necessitate pre-sampled tree topologies, limiting their modeling capability to a subset of the entire tree space. A recent advancement is ARTree, a deep autoregressive model that offers unrestricted distributions for tree topologies. However, its reliance on repetitive tree traversals and inefficient local message passing for computing topological node representations may hamper the scalability to large datasets. This paper proposes ARTreeFormer, a novel approach that harnesses fixed-point iteration and attention mechanisms to accelerate ARTree. By introducing a fixed-point iteration algorithm for computing the topological node embeddings, ARTreeFormer allows fast vectorized computation, especially on CUDA devices. This, together with an attention-based global message passing scheme, significantly improves the computation speed of ARTree while maintaining great approximation performance. We demonstrate the effectiveness and efficiency of our method on a benchmark of challenging real data phylogenetic inference problems.

Author summary: Our research introduces novel methods for probabilistic modeling over phylogenetic tree topologies that are useful for various phylogenetic inference tasks such as tree probability estimation and variational Bayesian phylogenetic inference. Our model is based on ARTree, but achieves better scalability by leveraging a fixed-point algorithm for solving linear systems and employing an expressive attention-based architecture that captures long-range dependencies between edges. On benchmark phylogenetic inference datasets, including one with one hundred taxa, we demonstrate that the new model significantly reduces the computational cost of ARTree, while achieving similar or better performance in density estimation and variational approximation. This work represents an important step toward scaling up variational inference for Bayesian phylogenetics. The techniques introduced here may also inspire future advances in scalable phylogenetic modeling.
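
Illustrative sketch (not taken from the paper): the abstract says the topological node embeddings are obtained by a fixed-point iteration that solves a linear system, but it does not state the system itself. The example below assumes one plausible form, in which each leaf carries a fixed one-hot embedding and each interior node's embedding equals the average of its neighbors' embeddings. The function name `fixed_point_embeddings`, the node numbering, and the four-leaf tree are hypothetical choices made only for illustration. All interior nodes are updated simultaneously in each sweep, which is the property that would let a real implementation express a sweep as one batched matrix product.

```python
def fixed_point_embeddings(edges, n_leaves, n_nodes, tol=1e-12, max_iter=1000):
    """Solve x_i = mean of neighbor embeddings (interior i) by fixed-point iteration.

    Assumption for this sketch: nodes 0..n_leaves-1 are leaves with fixed
    one-hot embeddings; nodes n_leaves..n_nodes-1 are interior nodes.
    """
    # Build adjacency lists from the undirected edge list.
    neighbors = [[] for _ in range(n_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Leaves start (and stay) one-hot; interior nodes start at zero.
    x = [
        [1.0 if j == i else 0.0 for j in range(n_leaves)] if i < n_leaves
        else [0.0] * n_leaves
        for i in range(n_nodes)
    ]

    for _ in range(max_iter):
        new_x = [row[:] for row in x]
        # Simultaneous (Jacobi-style) update of every interior node.
        for i in range(n_leaves, n_nodes):
            deg = len(neighbors[i])
            new_x[i] = [
                sum(x[k][j] for k in neighbors[i]) / deg
                for j in range(n_leaves)
            ]
        # Largest coordinate change in this sweep.
        delta = max(
            abs(a - b)
            for new_row, old_row in zip(new_x, x)
            for a, b in zip(new_row, old_row)
        )
        x = new_x
        if delta < tol:
            break
    return x


# Hypothetical unrooted tree: leaves 0..3, interior nodes 4 and 5.
EDGES = [(0, 4), (1, 4), (4, 5), (2, 5), (3, 5)]
emb = fixed_point_embeddings(EDGES, n_leaves=4, n_nodes=6)
# emb[4] is approximately [0.375, 0.375, 0.125, 0.125]
```

Under this assumed system the iteration converges because every interior node is connected, through a chain of averages, to at least one fixed leaf; for the toy tree above the error shrinks by a factor of three per sweep. No tree traversal order is needed, in contrast to the traversal-based computation the abstract attributes to ARTree.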

Date: 2025

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013768 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13768&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013768

DOI: 10.1371/journal.pcbi.1013768

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol.

 
Page updated 2025-12-07
Handle: RePEc:plo:pcbi00:1013768