An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data
Yuzhe Liu and Vanathi Gopalakrishnan
Additional contact information
Yuzhe Liu: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15260, USA
Vanathi Gopalakrishnan: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15260, USA
Data, 2017, vol. 2, issue 1, 1-15
Abstract:
Many clinical research datasets have a large percentage of missing values, which directly reduces their usefulness for training high-accuracy classifiers in supervised machine learning. While missing value imputation methods have been shown to work well with smaller percentages of missing values, their ability to impute sparse clinical research data can be problem specific. We previously attempted to learn quantitative guidelines for ordering cardiac magnetic resonance imaging during the evaluation for pediatric cardiomyopathy, but missing data significantly reduced our usable sample size. In this work, we sought to determine whether increasing the usable sample size through imputation would allow us to learn better guidelines. We first review several machine learning methods for estimating missing data. We then apply four popular methods (mean imputation, decision tree, k-nearest neighbors, and self-organizing maps) to a clinical research dataset of pediatric patients undergoing evaluation for cardiomyopathy. Using Bayesian Rule Learning (BRL) to learn ruleset models, we compared the performance of imputation-augmented models with that of unaugmented models. We found that all four imputation-augmented models performed similarly to unaugmented models. While imputation did not improve performance, it did provide evidence for the robustness of our learned models.
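As an illustration of two of the four methods named in the abstract, the following is a minimal sketch of mean imputation and k-nearest neighbors imputation on a small numeric table. It is not the authors' implementation: the function names `mean_impute` and `knn_impute`, the use of `None` to mark missing values, the distance measure (root mean squared difference over columns observed in both rows), and the toy data are all assumptions made for this example.

```python
import math


def mean_impute(rows):
    """Return a copy of rows with each None replaced by its column mean.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def knn_impute(rows, k=2):
    """Return a copy of rows with each None replaced by the mean of that
    column over the k nearest rows that have the column observed.

    Assumes at least one other row has the missing column observed.
    """
    def dist(a, b):
        # Compare only the columns that are observed in both rows.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is None:
                # Candidate donors: other rows with column j observed,
                # sorted by distance to the incomplete row.
                donors = sorted(
                    (dist(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                )[:k]
                out[i][j] = sum(v for _, v in donors) / len(donors)
    return out


# Hypothetical toy data: four records, three numeric features.
data = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [5.0, 6.0, 9.0],
    [0.9, None, 3.2],
]

print(mean_impute(data))
print(knn_impute(data, k=2))
```

In this toy table, mean imputation fills the first record's missing value with a column mean that is pulled upward by the outlying third record, whereas the k-nearest neighbors variant borrows only from the two most similar records. Both functions return new tables and leave the input unchanged.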
Keywords: missing value imputation; machine learning; decision tree imputation; k-nearest neighbors imputation; self-organizing map imputation
JEL-codes: C8 C80 C81 C82 C83
Date: 2017
Downloads:
https://www.mdpi.com/2306-5729/2/1/8/pdf (application/pdf)
https://www.mdpi.com/2306-5729/2/1/8/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:2:y:2017:i:1:p:8-:d:88768
Data is currently edited by Ms. Cecilia Yang