Machine-learning classifiers for imbalanced tornado data
Theodore Trafalis,
Indra Adrianto,
Michael Richman and
S. Lakshmivarahan
Computational Management Science, 2014, vol. 11, issue 4, 403-418
Abstract:
Learning from imbalanced data, where the number of observations in one class is significantly larger than the number in the other class, has gained considerable attention in the machine learning community. Assuming the difficulty in predicting each class is similar, most standard classifiers will tend to predict the majority class well. This study uses tornado data, which are highly imbalanced because tornadoes are rare events. In the severe weather data used herein, thunderstorm circulations (mesocyclones) produce tornadoes in approximately 6.7% of the total number of observations. However, since tornadoes are high-impact weather events, it is important to predict the minority class with high accuracy. In this study, we apply support vector machines (SVMs) and logistic regression with and without a midpoint threshold adjustment on the probabilistic outputs, random forest, and rotation forest for tornado prediction. Feature selection with SVM-recursive feature elimination was also performed to identify the most important features or variables for predicting tornadoes. The results showed that the threshold adjustment on SVMs provided better performance than the other classifiers. Copyright Springer-Verlag Berlin Heidelberg 2014
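The threshold adjustment mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the "midpoint" rule, so the code below assumes one plausible reading: the decision threshold is placed halfway between the mean predicted probability of the tornado class and that of the non-tornado class. The function names, the synthetic Beta-distributed scores, and the sample size are all hypothetical and are not taken from the paper; only the 6.7% minority share comes from the abstract.

```python
import random

def midpoint_threshold(probs, labels):
    """Assumed rule: midpoint between the mean predicted probability of each class."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    return 0.5 * (sum(pos) / len(pos) + sum(neg) / len(neg))

def recall(probs, labels, threshold):
    """Fraction of minority-class (label 1) cases flagged at the given threshold."""
    hits = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    return hits / sum(labels)

def false_alarm_rate(probs, labels, threshold):
    """Fraction of majority-class (label 0) cases wrongly flagged at the threshold."""
    alarms = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return alarms / (len(labels) - sum(labels))

# Synthetic stand-in for a classifier's probabilistic outputs (not real tornado data).
rng = random.Random(0)
n_total = 3000
n_pos = round(0.067 * n_total)          # 6.7% minority share, as in the abstract
labels = [1] * n_pos + [0] * (n_total - n_pos)
# Hypothetical score distributions: the minority class scores higher on average,
# but most of its scores still fall below the default threshold of 0.5.
probs = [rng.betavariate(2, 4) if y == 1 else rng.betavariate(1, 12) for y in labels]

t_mid = midpoint_threshold(probs, labels)
for name, t in (("default 0.5", 0.5), ("midpoint", t_mid)):
    print(name, "threshold:", round(t, 3),
          "recall:", round(recall(probs, labels, t), 3),
          "false alarm rate:", round(false_alarm_rate(probs, labels, t), 3))
```

In this sketch, lowering the threshold trades a higher false alarm rate for higher recall on the rare class, which is the kind of trade-off the abstract's emphasis on predicting the minority class suggests. In practice the threshold would be chosen on training or validation data and then applied to held-out data.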
Keywords: Machine learning; Support vector machines; Random forest; Rotation forest; Logistic regression; Tornado detection; 62H30; 68Q32; 62J86
Date: 2014
Citations: 2 (in EconPapers)
Downloads: (external link)
http://hdl.handle.net/10.1007/s10287-013-0174-6 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:spr:comgts:v:11:y:2014:i:4:p:403-418
Ordering information: This journal article can be ordered from
http://www.springer. ... ch/journal/10287/PS2
DOI: 10.1007/s10287-013-0174-6
Computational Management Science is currently edited by Ruediger Schultz
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.