Healthcare Expenditure Prediction with Neighbourhood Variables – A Random Forest Model

Mohnen Sigrid M., Rotteveel Adriënne H., Doornbos Gerda and Polder Johan J.
Additional contact information
Mohnen Sigrid M.: National Institute for Public Health and the Environment (RIVM), Centre for Nutrition, Prevention, and Health Services, Bilthoven, the Netherlands
Rotteveel Adriënne H.: National Institute for Public Health and the Environment (RIVM), Centre for Nutrition, Prevention, and Health Services, Bilthoven, the Netherlands
Doornbos Gerda: National Institute for Public Health and the Environment (RIVM), Centre for Nutrition, Prevention, and Health Services, Bilthoven, the Netherlands
Polder Johan J.: National Institute for Public Health and the Environment (RIVM), Centre for Health and Society, Bilthoven, the Netherlands

Statistics, Politics and Policy, 2020, vol. 11, issue 2, 111-138

Abstract: We investigated the additional predictive value of an individual’s neighbourhood (quality and location), and of changes therein on his/her healthcare costs. To this end, we combined several Dutch nationwide data sources from 2003 to 2014, and selected inhabitants who moved in 2010. We used random forest models to predict the area under the curve of the regular healthcare costs of individuals in the years 2011–2014. In our analyses, the quality of the neighbourhood before the move appeared to be quite important in predicting healthcare costs (i.e. importance rank 11 out of 126 socio-demographic and neighbourhood variables; rank 73 out of 261 in the full model with prior expenditure and medication). The predictive performance of the models was evaluated in terms of R² (or proportion of explained variance) and MAE (mean absolute (prediction) error). The model containing only socio-demographic information improved marginally when neighbourhood was added (R² +0.8%, MAE −€5). The full model remained the same for the study population (R² = 48.8%, MAE of €1556) and for subpopulations. These results indicate that only in prediction models in which prior expenditure and utilization cannot or ought not to be used neighbourhood might be an interesting source of information to improve predictive performance.
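For readers unfamiliar with the two performance measures named in the abstract, the following is a minimal sketch of how R² (proportion of explained variance) and MAE (mean absolute prediction error) are conventionally computed. It is not the authors' code. The function names and the four cost values are made up for illustration and have no connection to the study data.

```python
def r_squared(actual, predicted):
    """Proportion of the variance in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared deviations from the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical yearly costs in euros (illustrative only, not study data).
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1500.0, 1500.0, 3500.0, 3500.0]

r2 = r_squared(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(f"R^2 = {r2:.1%}, MAE = {mae:.0f} euros")
```

On this toy input every prediction is off by €500, so the MAE is €500, and the predictions account for 80% of the variance in the actual values.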

Keywords: importance ranks; healthcare costs; risk adjustment; machine learning; demand effect
Date: 2020

Downloads: (external link)
https://doi.org/10.1515/spp-2019-0010 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:statpp:v:11:y:2020:i:2:p:111-138:n:2

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/spp/html

DOI: 10.1515/spp-2019-0010

Statistics, Politics and Policy is currently edited by Joel A. Middleton

More articles in Statistics, Politics and Policy from De Gruyter
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:statpp:v:11:y:2020:i:2:p:111-138:n:2