Imputing Phylogenetic Trees Using Tropical Polytopes over the Space of Phylogenetic Trees
Ruriko Yoshida, Naval Postgraduate School, Monterey, CA 93943-5219, USA
Mathematics, 2023, vol. 11, issue 15, 1-10
Abstract:
Comparative phylogenetic analyses of genome data face a significant challenge: some of the given species (or taxa) often have missing genes (i.e., data). In such cases, the missing part of a gene tree must be imputed from a sample of gene trees. In this short paper, we propose a novel method to infer the missing part of a phylogenetic tree using an analogue of classical linear regression in the setting of tropical geometry. Our approach considers the tropical polytope, that is, a convex hull with respect to the tropical metric, that is closest to the data points. We give a condition under which the tree estimated by the method is guaranteed to be within a Robinson–Foulds (RF) distance of four from the ground truth. Computational experiments with simulated data and with empirical data from Clavicipitaceae, which contains more than 4000 genes, show that the method works well.
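To make the terms "tropical metric" and "tropical polytope" concrete, here is a minimal sketch in Python under the max-plus convention. It is not taken from the paper: the function names `tropical_distance` and `tropical_projection` and the toy vectors are hypothetical, and the projection shown is the standard textbook formula for projecting a point onto the tropical convex hull of a finite vertex set, not necessarily the exact imputation procedure the author uses.

```python
# Illustrative sketch only; names and data are assumptions, not the paper's code.

def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_projection(x, vertices):
    """Project x onto the max-plus tropical convex hull of `vertices`.

    For each vertex v, lambda_v = min_k (x_k - v_k); the projection is the
    coordinate-wise maximum of the shifted vertices lambda_v + v.
    """
    scalars = [min(xk - vk for xk, vk in zip(x, v)) for v in vertices]
    return [
        max(lam + v[k] for lam, v in zip(scalars, vertices))
        for k in range(len(x))
    ]


# Toy vectors standing in for vectorized trees (hypothetical values).
vertices = [[0, 0, 0], [0, 2, 4]]
x = [0, 1, 5]

p = tropical_projection(x, vertices)
print(p)                         # [0, 1, 3]
print(tropical_distance(x, p))   # 2
```

In a setting like the one the abstract describes, the vertices would be vectorized gene trees and `x` a point to be approximated by the polytope; how missing coordinates are handled is specific to the paper and is not reproduced here.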
Keywords: missing information; phylogenomics; phylogenetics; tropical geometry
JEL-codes: C
Date: 2023
Downloads:
https://www.mdpi.com/2227-7390/11/15/3419/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/15/3419/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:15:p:3419-:d:1211538
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.