
Learning structured population models from data with WSINDy

Rainey Lyons, Vanja Dukic and David M Bortz

PLOS Computational Biology, 2025, vol. 21, issue 12, 1-22

Abstract: Characteristics of individuals in a population, such as age and size, play a key role in determining how populations change over time. In contexts of population dynamics, identifying effective model features, such as fecundity and mortality rates, is generally a complex and computationally intensive process, especially when the dynamics are heterogeneous across the population. In this work, we propose a Weak form Scientific Machine Learning-based method for selecting appropriate model ingredients from a library of scientifically feasible functions used to model structured populations. This paper presents extensions of the Weak form Sparse Identification of Nonlinear Dynamics (WSINDy) method to select the best-fitting ingredients from noisy time-series histogram data. This extension includes learning heterogeneous dynamics and also learning the boundary processes (such as birth) of the model directly from the data. We additionally incorporate a cross-validation method which helps fine tune the recovered boundary process hyperparameters to the data. Several test cases are considered, demonstrating the method’s performance for several standard models from population modeling, including age and size-structured models. Through these examples, we examine both the advantages and limitations of the method, with a particular focus on the distinguishability of terms in the library.

Author summary: Physiological characteristics of individuals, such as age and size, play a key role in determining how populations change over time. Developing effective mathematical models to describe the population dynamics requires determining how vital rates, such as mortality and fertility rates, depend on the individuals’ state. In this work, we propose a method for selecting these state-dependent rates from a library of scientifically plausible options, using time-series population data.
Our approach builds on and adapts an existing Weak form Scientific Machine Learning technique, originally developed for discovering underlying dynamical systems from data. We test the method using both artificial data, generated from known models, and real population data from a previous study. Through these case studies, we evaluate the method’s ability to recover the correct model ingredients and predict the future population distribution, even when the data are heavily corrupted by noise. We also examine the limitations of the approach, particularly in cases where different candidate terms produce similar effects and are therefore difficult to distinguish.
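
Illustration (not from the paper): the weak-form sparse regression idea that the abstract refers to can be sketched on a far simpler problem than a structured population model. The sketch below fits a scalar logistic ODE, u' = u - 0.5 u^2, from sampled data by multiplying the equation by compactly supported test functions, integrating by parts so that no derivative of the data is needed, and then applying sequentially thresholded least squares to a small candidate library. Everything here is an assumption made for illustration: the function name `weak_sparse_fit`, the test function (1 - s^2)^4, the library, and all parameter values. It does not reproduce the authors' code and does not cover histogram data, heterogeneous dynamics, or boundary (birth) processes.

```python
import numpy as np

def weak_sparse_fit(t, u, library, n_tests=40, half_width=20,
                    threshold=0.1, n_iter=10):
    """Hypothetical helper: weak-form sparse fit of u' = sum_j w_j f_j(u)."""
    dt = t[1] - t[0]
    # Test function on the reference interval [-1, 1]; it and its first
    # derivatives vanish at the endpoints, so boundary terms drop out.
    s = np.linspace(-1.0, 1.0, 2 * half_width + 1)
    phi = (1.0 - s**2) ** 4
    # d(phi)/dt via the chain rule, with s = (t - t_center) / (half_width * dt).
    dphi = -8.0 * s * (1.0 - s**2) ** 3 / (half_width * dt)

    theta = np.column_stack([f(u) for f in library])  # library evaluated on data
    centers = np.linspace(half_width, len(t) - half_width - 1, n_tests).astype(int)

    G = np.empty((n_tests, theta.shape[1]))
    b = np.empty(n_tests)
    for k, c in enumerate(centers):
        window = slice(c - half_width, c + half_width + 1)
        # Integration by parts: int(phi * u') dt = -int(phi' * u) dt.
        b[k] = -np.sum(dphi * u[window]) * dt
        # Weak-form library columns: int(phi * f_j(u)) dt.
        G[k] = (phi @ theta[window]) * dt

    # Sequentially thresholded least squares to promote sparsity.
    w = np.linalg.lstsq(G, b, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(w) < threshold
        w[small] = 0.0
        if (~small).any():
            w[~small] = np.linalg.lstsq(G[:, ~small], b, rcond=None)[0]
    return w

# Synthetic, noise-free logistic data: u' = r*u*(1 - u/K) with r = 1, K = 2.
t = np.linspace(0.0, 10.0, 1001)
u = 2.0 / (1.0 + (2.0 / 0.1 - 1.0) * np.exp(-t))

# Candidate library: u, u^2, u^3 (the cubic term is a deliberate distractor).
library = [lambda x: x, lambda x: x**2, lambda x: x**3]
w = weak_sparse_fit(t, u, library)
print(w)
```

The data are noise-free here to keep the example small; how well such a fit tolerates noise, and whether similar library terms can be told apart, are exactly the questions the abstract says the paper examines.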

Date: 2025

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013742 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13742&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013742

DOI: 10.1371/journal.pcbi.1013742

More articles in PLOS Computational Biology from Public Library of Science

 
Page updated 2025-12-14
Handle: RePEc:plo:pcbi00:1013742