Estimating effective population size changes from preferentially sampled genetic sequences

Michael D Karcher, Luiz Max Carvalho, Marc A Suchard, Gytis Dudas and Vladimir N Minin

PLOS Computational Biology, 2020, vol. 16, issue 10, 1-22

Abstract: Coalescent theory combined with statistical modeling allows us to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. When sequences are sampled serially through time and the distribution of the sampling times depends on the effective population size, explicit statistical modeling of sampling times improves population size estimation. Previous work assumed that the genealogy relating sampled sequences is known and modeled sampling times as an inhomogeneous Poisson process with log-intensity equal to a linear function of the log-transformed effective population size. We improve this approach in two ways. First, we extend the method to allow for joint Bayesian estimation of the genealogy, effective population size trajectory, and other model parameters. Next, we improve the sampling time model by incorporating additional sources of information in the form of time-varying covariates. We validate our new modeling framework using a simulation study and apply our new methodology to analyses of population dynamics of seasonal influenza and to the recent Ebola virus outbreak in West Africa.

Author summary: Estimating changes in the number of individuals in a given population is a challenging problem in some settings. For example, estimating population size trajectories of the number of people infected by a pathogen (e.g., Influenza virus) is a difficult problem, because many infections in a large population remain unobserved/hidden. One indirect way of assessing population size changes is to take a sample of individuals from the population of interest and analyze genetic sequences from these individuals (e.g., Influenza virus genomes). Intuitively, genetic data is informative about population size changes, because genetic diversity increases/decreases together with the population size. However, if we sample more individuals when the population size increases and fewer when it decreases, this strategy produces biased results. To avoid this bias, we propose a method that explicitly and flexibly models potential dependency of genetic sequence sampling on the population size. An added bonus of this new modeling framework is more precise estimation of population size changes. We demonstrate strengths of our new methodology on simulated data and on genetic sequences of Influenza and Ebola viruses.
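
Illustration (not from the paper): the abstract describes sampling times as an inhomogeneous Poisson process whose log-intensity is a linear function of the log effective population size, which can be written as log lambda(t) = beta0 + beta1 * log Ne(t). The sketch below simulates such preferentially sampled times by thinning a homogeneous Poisson process. The symbol names, the seasonal trajectory `seasonal_ne`, and all parameter values are assumptions chosen for illustration. It does not reproduce the authors' inference method or the covariate extension.

```python
import math
import random


def sampling_intensity(t, ne, beta0, beta1):
    """Intensity whose log is linear in log Ne(t): exp(beta0) * Ne(t)**beta1."""
    return math.exp(beta0 + beta1 * math.log(ne(t)))


def simulate_sampling_times(ne, beta0, beta1, t_end, lam_max, rng):
    """Draw sampling times on [0, t_end] by thinning.

    Candidate times come from a homogeneous Poisson process with rate
    lam_max; each is kept with probability intensity(t) / lam_max.
    lam_max must be an upper bound on the intensity over [0, t_end].
    """
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(lam_max)  # gap to the next candidate time
        if t > t_end:
            return times
        lam = sampling_intensity(t, ne, beta0, beta1)
        if lam > lam_max:
            raise ValueError("lam_max does not bound the intensity")
        if rng.random() < lam / lam_max:
            times.append(t)


def seasonal_ne(t):
    """Hypothetical seasonal trajectory with period 1 and range [10, 190]."""
    return 100.0 + 90.0 * math.sin(2.0 * math.pi * t)


rng = random.Random(1)
beta0, beta1 = 0.0, 1.0  # assumed values; beta1 > 0 means more sampling when Ne is large
lam_max = 200.0          # safe bound here, since Ne(t) <= 190 and beta1 = 1
times = simulate_sampling_times(seasonal_ne, beta0, beta1, 5.0, lam_max, rng)

# Share of samples that fall in the half of each period where Ne(t) >= 100.
high_share = sum(1 for t in times if (t % 1.0) < 0.5) / len(times)
print(len(times), round(high_share, 2))
```

With beta1 = 0 the intensity no longer depends on Ne(t) and sampling is uniform in time. With beta1 > 0, samples cluster where the population is large, which is the dependence the author summary identifies as a source of bias when it is ignored.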

Date: 2020
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007774 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 07774&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1007774

DOI: 10.1371/journal.pcbi.1007774

Page updated 2025-03-22
Handle: RePEc:plo:pcbi00:1007774