The impact of genetic diversity statistics on model selection between coalescents

Freund, Fabian; Siri-Jégousse, Arno

Computational Statistics & Data Analysis, 2021, vol. 156, issue C

Abstract: Modeling genetic diversity requires an underlying genealogy model. To choose a fitting model based on genetic data, one can perform model selection between classes of genealogical trees, e.g. Kingman’s coalescent with exponential growth or multiple merger coalescents. Such selection can be based on many different statistics measuring genetic diversity. A random-forest-based Approximate Bayesian Computation approach is used to disentangle the effects of different statistics on distinguishing between various classes of genealogy models. For the specific question of inferring whether genealogies feature multiple mergers, a new statistic, the minimal observable clade size, is introduced. When combined with classical site-frequency-based statistics, it reduces classification errors considerably.

Keywords: Multiple merger; Exponential growth; Coalescent; Approximate Bayesian Computation; Genetic diversity statistics; Clade size
Date: 2021

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947320301468
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:156:y:2021:i:c:s0167947320301468

DOI: 10.1016/j.csda.2020.107055

Computational Statistics & Data Analysis is currently edited by S.P. Azen
