Extending cluster-based ensemble learning through synthetic population generation for modeling disparities in health insurance coverage across Missouri
Erik D. Mueller,
J. S. Onésimo Sandoval,
Srikanth P. Mudigonda and
Michael Elliott
Additional contact information
Erik D. Mueller: Saint Louis University
J. S. Onésimo Sandoval: Saint Louis University
Srikanth P. Mudigonda: Saint Louis University
Michael Elliott: Saint Louis University
Journal of Computational Social Science, 2019, vol. 2, issue 2, No 9, 291 pages
Abstract:
In a previous study, Mueller et al. (ISPRS Int J Geo-Inf 8(1):13, 2019) presented a machine learning ensemble algorithm that uses K-means clustering as a preprocessing technique to increase predictive modeling performance. As a follow-on research effort, this study tests the stability and sensitivity of that algorithm and presents a method for extracting localized and state-level variable importance information from the original dataset using a nontraditional technique known as synthetic population generation. By iteratively generating synthetic populations whose underlying statistical properties are similar to those of the original dataset, and by exploring the distribution of health insurance coverage across the state of Missouri, we identified three groups of variables: those that contributed to clustering decisions, those that contributed most significantly to modeling health insurance coverage status throughout the state, and those that were most influential in optimizing model performance, as measured by their impact on change in mean squared error (MSE). Results suggest that cluster-based preprocessing approaches for machine learning algorithms can significantly increase performance. They also demonstrate how synthetic populations can be used in performance measurement to identify and test the extent to which the statistical properties of variables within a dataset can vary without significant performance loss.
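The two ideas summarized in the abstract, clustering as a preprocessing step and synthetic populations as a stability check, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy data, the per-cluster linear regression standing in for the ensemble, the hand-written K-means, and the Gaussian resampling used as a stand-in synthetic population generator are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    # Deterministic init (assumption): rows spread evenly along the first column.
    order = np.argsort(X[:, 0])
    centers = X[order[np.linspace(0, len(X) - 1, k).astype(int)]]
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

def fit_linear(X, y):
    """Ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def mse(y, yhat):
    return float(np.mean((y - yhat) ** 2))

def clustered_mse(X, y, labels):
    """Fit one linear model per cluster; return the pooled in-sample MSE."""
    yhat = np.empty_like(y)
    for j in np.unique(labels):
        m = labels == j
        yhat[m] = predict(fit_linear(X[m], y[m]), X[m])
    return mse(y, yhat)

def synthesize(X, y, labels, rng):
    """Draw a synthetic population matching each cluster's mean and covariance."""
    parts = []
    for j in np.unique(labels):
        Z = np.column_stack([X[labels == j], y[labels == j]])
        parts.append(rng.multivariate_normal(
            Z.mean(axis=0), np.cov(Z, rowvar=False), size=len(Z)))
    S = np.vstack(parts)
    return S[:, :-1], S[:, -1]

# Toy data: two groups in which the outcome depends on x1 with opposite signs.
n = 200
XA = rng.normal([-3, 0], 1.0, size=(n, 2))
XB = rng.normal([3, 0], 1.0, size=(n, 2))
X = np.vstack([XA, XB])
y = np.concatenate([2.0 * XA[:, 1], -2.0 * XB[:, 1]]) + rng.normal(0, 0.1, 2 * n)

labels = kmeans(X, k=2)
global_mse = mse(y, predict(fit_linear(X, y), X))
cluster_mse = clustered_mse(X, y, labels)

# Stability check: re-cluster and refit on repeated synthetic populations.
synthetic_mses = []
for _ in range(20):
    Xs, ys = synthesize(X, y, labels, rng)
    synthetic_mses.append(clustered_mse(Xs, ys, kmeans(Xs, k=2)))

print("global MSE:", global_mse)
print("cluster-based MSE:", cluster_mse)
print("synthetic MSE range:", min(synthetic_mses), max(synthetic_mses))
```

In this toy setting a single global model cannot capture the opposite slopes in the two groups, while per-cluster models can. The spread of MSE across synthetic populations gives a rough sense of how sensitive the clustered model is to resampling under similar statistical properties.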
Keywords: Synthetic population generation; Ensemble modeling; Machine learning; Model validation; Stability testing; Health disparities; Health insurance
Date: 2019
Downloads: (external link)
http://link.springer.com/10.1007/s42001-019-00047-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:jcsosc:v:2:y:2019:i:2:d:10.1007_s42001-019-00047-7
Ordering information: This journal article can be ordered from
http://www.springer. ... iences/journal/42001
DOI: 10.1007/s42001-019-00047-7
Journal of Computational Social Science is currently edited by Takashi Kamihigashi
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.