Effects of species prevalence on the performance of predictive models

Ratha Sor, Young-Seuk Park, Pieter Boets, Peter L.M. Goethals and Sovan Lek

Ecological Modelling, 2017, vol. 354, issue C, 11-19

Abstract: Predictive models are useful for supporting decision making, management and conservation planning. However, model performance varies across techniques and is affected by several factors, including species prevalence (i.e. the occurrence rate of each species across all samples). Here, we analysed and compared the performance of four common modelling techniques in relation to species prevalence. The occurrence of macroinvertebrates collected at 63 sites along the Lower Mekong Basin was predicted using Logistic Regression, Random Forest, Support Vector Machine and Artificial Neural Network (ANN). Model performance was evaluated using Cohen’s Kappa Statistic (Kappa), the area under the receiver operating characteristic curve (AUC) and error rate. We found a highly significant quadratic effect of species prevalence on the performance of the four modelling techniques. Kappa and AUC were less dependent on species prevalence, making them better measures. The best performance (Kappa and AUC) was reached when predicting species with an intermediate prevalence (e.g. 0.4–0.6). The four modelling techniques yielded significantly different performances (p<0.01); ANN generally performed better over the complete prevalence range (i.e. 0.0–1.0) and the lower prevalence range (i.e. <0.1). However, the four techniques performed similarly when predicting species with a higher prevalence (i.e. ≥0.3). Our results provide useful insights into the application of modelling techniques for predicting species occurrence and into how their performance varies for species with different prevalence ranges. We suggest that the selection of appropriate modelling techniques should carefully take species prevalence into account, particularly in the case of rare and generalist species.
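
As an illustration of the quantities named in the abstract, the following minimal sketch computes species prevalence, error rate and Cohen's Kappa from presence/absence vectors. It is not the authors' code: the function names and the toy data are assumptions made for this example, and the formulas are the standard textbook definitions for a binary confusion matrix.

```python
def prevalence(presence):
    """Fraction of samples in which the species occurs (1 = present, 0 = absent)."""
    return sum(presence) / len(presence)


def confusion_counts(observed, predicted):
    """Return (tp, tn, fp, fn) for binary presence/absence vectors."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    return tp, tn, fp, fn


def error_rate(observed, predicted):
    """Share of samples that were misclassified."""
    tp, tn, fp, fn = confusion_counts(observed, predicted)
    return (fp + fn) / (tp + tn + fp + fn)


def cohens_kappa(observed, predicted):
    """Observed agreement corrected for the agreement expected by chance."""
    tp, tn, fp, fn = confusion_counts(observed, predicted)
    n = tp + tn + fp + fn
    p_o = (tp + tn) / n
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    if p_e == 1.0:
        return 0.0  # convention chosen here for the degenerate case
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: an intermediate-prevalence species ...
obs_mid = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_mid = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

# ... and a rare species that a model always predicts as absent.
obs_rare = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pred_rare = [0] * 10

for name, obs, pred in [("intermediate", obs_mid, pred_mid),
                        ("rare", obs_rare, pred_rare)]:
    print(name, prevalence(obs), error_rate(obs, pred),
          round(cohens_kappa(obs, pred), 3))
```

In this toy case the rare species has the lower error rate (0.1 against 0.2) even though the model never detects it, while its Kappa is 0. This shows in general terms how error rate alone can be tied to prevalence.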

Keywords: Quadratic effect; Species occurrence; Logistic regression; Random forest; Artificial neural network; Support vector machine; Macroinvertebrates; Habitat suitability; Mekong river
Date: 2017

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0304380016304367
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:ecomod:v:354:y:2017:i:c:p:11-19

DOI: 10.1016/j.ecolmodel.2017.03.006

Ecological Modelling is currently edited by Brian D. Fath

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:ecomod:v:354:y:2017:i:c:p:11-19