Mining ICDDR, B Hospital Surveillance Data and Exhibiting Strategies for Balancing Large Unbalanced Datasets

Adnan Firoze and Rashedur M. Rahman
Additional contact information
Adnan Firoze: School of Engineering and Applied Science (SEAS), Columbia University, New York City, NY, USA
Rashedur M. Rahman: Department of Electrical and Computer Engineering, North South University, Dhaka, Bangladesh

International Journal of Healthcare Information Systems and Informatics (IJHISI), 2015, vol. 10, issue 1, 39-66

Abstract: This research applies a number of classifier models to Hospital Surveillance data to classify admitted patients according to how critical their condition is. Three class labels distinguish the criticality of the admitted patients. The work sets forth two distinct approaches to the over-fitting problem in the unbalanced dataset, which arises because instances of the class 'low' are significantly more frequent than those of the other two classes. Apart from trimming the dataset to balance the classes, the work addresses over-fitting by introducing the 'Synthetic Minority Over-sampling Technique' (SMOTE) algorithm coupled with Locally Linear Embedding (LLE). It constructs three models that apply neural network and multinomial logistic regression classification, and compares their performance with the models developed by Rahman and Hasan (2011), who used several decision tree models to classify the same dataset using tenfold cross-validation. For a more comprehensive comparison, the work also evaluates the classification performance of the authors' novel third model using a support vector machine (SVM). The comparison shows that one of the authors' models surpasses all prior models in classification performance once the performance-time trade-off is taken into account, yielding a model that handles large-scale unbalanced datasets efficiently with standard classification performance. The models developed in this research can become important tools for doctors when large numbers of patients arrive in a short interval, especially during epidemics. Since machine intervention becomes a necessity when doctors are scarce, computer applications powered by these models can help diagnose newly arrived patients and measure their criticality using the historical data kept in the surveillance database.
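
As a rough illustration of the over-sampling idea named in the abstract, the sketch below generates synthetic minority-class rows by interpolating between a sample and one of its nearest minority-class neighbours, which is the core step of SMOTE. It is a minimal sketch and not the authors' implementation: the function name `smote_sketch`, its parameters, and the toy data in the usage note are all hypothetical, and the LLE step that the paper couples with SMOTE is not shown.

```python
import numpy as np


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbours.

    Hypothetical helper for illustration only; not the paper's code.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its k nearest neighbours for each new row.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Place the synthetic row at a random point on the segment between them.
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[picked] - X[base])
```

For example, calling `smote_sketch(X_high, 100)` on a hypothetical array `X_high` holding the rows of an under-represented class would return 100 synthetic rows to append to the training set before fitting a classifier. Because each synthetic row is a convex combination of two real rows, it always lies between existing minority samples.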

Date: 2015

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 18/IJHISI.2015010103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jhisi0:v:10:y:2015:i:1:p:39-66

International Journal of Healthcare Information Systems and Informatics (IJHISI) is currently edited by Qiang (Shawn) Cheng

More articles in International Journal of Healthcare Information Systems and Informatics (IJHISI) from IGI Global

 
Page updated 2025-03-19
Handle: RePEc:igg:jhisi0:v:10:y:2015:i:1:p:39-66