Correcting for the effects of class imbalance improves the performance of machine-learning based species distribution models

Donald J. Benkendorf, Samuel D. Schwartz, D. Richard Cutler and Charles P. Hawkins

Ecological Modelling, 2023, vol. 483, issue C

Abstract: Numerous methods have been developed to combat the unwanted effects of imbalanced training data on the performance of machine-learning based predictive models. These methods attempt to balance model sensitivity and specificity. However, the effects of specific imbalance-correction methods on the performance of different machine-learning algorithms are not well understood for ecological data. In this study, we used four machine-learning algorithms (random forest, artificial neural network, gradient boosting, support vector machine) and five imbalance-correction methods (base algorithm = no correction, cutoff, up-sampling, down-sampling, weighting) to produce species distribution models for 15 freshwater macroinvertebrate genera that varied from 2.5 to 29.0% in prevalence. All imbalance-correction methods substantially improved average model performance (true skill statistic) over the base machine-learning algorithms, except when up-sampling was applied to random forest models. Choice of machine-learning algorithm had little effect on model performance, although gradient boosting performed better than other algorithms on the most imbalanced datasets. Our results suggest that the performance of species distribution models built with presence/absence data can generally be improved by correcting for imbalanced data.
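
The evaluation metric and the four corrections named in the abstract can be illustrated with a small sketch. This is not the authors' code. The true skill statistic is the standard sensitivity + specificity - 1. Everything else is an assumption made for illustration: the toy labels and probabilities, the function names, the choice of prevalence as the cutoff, the resampling to exact class parity, and the "balanced" weighting formula. The abstract does not specify any of these details.

```python
import random

def true_skill_statistic(y_true, y_pred):
    """TSS = sensitivity + specificity - 1 for 0/1 labels.
    Assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn) + tn / (tn + fp) - 1

def classify(probs, cutoff=0.5):
    """Cutoff correction: turn predicted probabilities into 0/1 calls."""
    return [1 if p >= cutoff else 0 for p in probs]

def _split_by_class(labels):
    """Return (minority indices, majority indices)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return (pos, neg) if len(pos) <= len(neg) else (neg, pos)

def down_sample(rows, labels, seed=0):
    """Randomly drop majority-class rows until the classes are equal in size."""
    rng = random.Random(seed)
    minority, majority = _split_by_class(labels)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]

def up_sample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until the classes are equal in size."""
    rng = random.Random(seed)
    minority, majority = _split_by_class(labels)
    extra = rng.choices(minority, k=len(majority) - len(minority))
    keep = sorted(minority + majority + extra)
    return [rows[i] for i in keep], [labels[i] for i in keep]

def balanced_class_weights(labels):
    """Weighting: weight each class inversely to its frequency."""
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: len(labels) / (len(counts) * k) for c, k in counts.items()}

# Hypothetical toy data: 10 sites, genus present at 2 of them (20% prevalence).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.45, 0.30, 0.35, 0.10, 0.05, 0.15, 0.08, 0.12, 0.02, 0.18]
prevalence = sum(labels) / len(labels)

tss_default = true_skill_statistic(labels, classify(probs, 0.5))
tss_cutoff = true_skill_statistic(labels, classify(probs, prevalence))
print(tss_default, tss_cutoff)          # 0.0 0.875
print(balanced_class_weights(labels))   # presence 2.5, absence 0.625
```

In this toy case the default 0.5 cutoff predicts absence everywhere, so sensitivity is 0 and TSS is 0. Lowering the cutoff to the prevalence recovers both presences at the cost of one false positive. The resampling and weighting functions act on the training data rather than on the predictions, so their effect would only show up after refitting a model.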

Keywords: Species distribution models; Class imbalance; Prevalence; Machine-learning; Aquatic macroinvertebrates
Date: 2023

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S030438002300145X
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:ecomod:v:483:y:2023:i:c:s030438002300145x

DOI: 10.1016/j.ecolmodel.2023.110414

Ecological Modelling is currently edited by Brian D. Fath

Handle: RePEc:eee:ecomod:v:483:y:2023:i:c:s030438002300145x