Classifying highly imbalanced ICU data
Yazan Roumani,
Jerrold May,
David Strum and
Luis Vargas
Health Care Management Science, 2013, vol. 16, issue 2, 119-128
Abstract:
Highly imbalanced data sets are those where the class of interest is rare. In this paper, we compare the performance of several common data mining methods (logistic regression, discriminant analysis, Classification and Regression Tree (CART) models, C5, and Support Vector Machines (SVM)) in predicting the discharge status (alive or deceased, with “deceased” being the class of interest) of patients from an Intensive Care Unit (ICU). Evaluated across a variety of misclassification cost ratio (MCR) values, with specificity, recall, precision, the F-measure, and confusion entropy (CEN) as performance criteria, C5 and SVM performed better than the other methods. At an MCR of 100, C5 had the highest recall, and SVM had the highest specificity and the lowest CEN. We also used Hand’s measure to compare the five methods; by that measure, logistic regression performed best. This article makes several contributions. We show that using an MCR when analyzing imbalanced medical data significantly improves classification performance. We also found that the F-measure and precision did not improve as the MCR was increased. Copyright Springer Science+Business Media New York 2013
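As a rough illustration of how a misclassification cost ratio interacts with the criteria named in the abstract, the sketch below applies the standard expected-cost decision rule (predict the rare class when its estimated probability exceeds 1 / (1 + MCR)) and computes recall, specificity, precision, and the F-measure. This is not the paper's implementation: the function names, the toy labels, and the probabilities are all hypothetical, and CEN and Hand's measure are not covered.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    recall: float
    specificity: float
    precision: float
    f_measure: float


def cost_threshold(mcr: float) -> float:
    """Probability cutoff that minimizes expected cost when a false
    negative costs `mcr` times as much as a false positive."""
    return 1.0 / (1.0 + mcr)


def classify(probs, mcr):
    """Label a case 1 (rare class) when its probability exceeds the cutoff."""
    cutoff = cost_threshold(mcr)
    return [1 if p > cutoff else 0 for p in probs]


def evaluate(y_true, y_pred) -> Metrics:
    """Compute the four criteria from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(recall, specificity, precision, f_measure)


# Hypothetical toy data: 2 rare-class cases (label 1) out of 10.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
probs = [0.40, 0.05, 0.10, 0.02, 0.30, 0.01, 0.08, 0.03, 0.15, 0.60]

for mcr in (1, 4):
    print(mcr, evaluate(y_true, classify(probs, mcr)))
```

On this toy data, raising the MCR from 1 to 4 lowers the cutoff from 0.5 to 0.2, which increases recall at the expense of specificity and precision. That trade-off is only an artifact of the made-up numbers and does not reproduce the paper's results.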
Keywords: Data mining; Imbalanced data; Misclassification cost; Hand’s measure; Intensive Care Unit (ICU)
Date: 2013
Downloads:
http://hdl.handle.net/10.1007/s10729-012-9216-9 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:kap:hcarem:v:16:y:2013:i:2:p:119-128
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10729
DOI: 10.1007/s10729-012-9216-9
Health Care Management Science is currently edited by Yasar Ozcan
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.