Comparing imputation approaches to handle systematically missing inputs in risk calculators

Anja Mühlemann, Philip Stange, Antoine Faul, Serena Lozza-Fiacco, Rowan Iskandar, Manuela Moraru, Susanne Theis, Petra Stute, Ben D Spycher and David Ginsbourger

PLOS Digital Health, 2025, vol. 4, issue 1, 1-26

Abstract: Risk calculators based on statistical and/or mechanistic models have flourished and are increasingly available for a variety of diseases. However, in day-to-day practice, their usage may be hampered by missing input variables. Certain measurements needed to calculate disease risk may be difficult to acquire, e.g. because they necessitate blood draws, and may be systematically missing in the population of interest. We compare several deterministic and probabilistic imputation approaches to surrogate predictions from risk calculators while accounting for uncertainty due to systematically missing inputs. The considered approaches predict missing inputs from available ones. In the case of probabilistic imputation, this leads to probabilistic prediction of the risk. We compare the methods using scoring techniques for forecast evaluation, with a focus on the Brier and CRPS scores. We also discuss the classification of patients into risk groups defined by thresholding predicted probabilities. While the considered procedures are not meant to replace fully-informed risk calculations, employing them to get first indications of risk distribution in the absence of at least one input parameter may find useful applications in medical practice. To illustrate this, we use the SCORE2 risk calculator for cardiovascular disease and a data set including medical data from 359 women, obtained from the gynecology department at the Inselspital in Bern, Switzerland. Using this data set, we mimic the situation where some input parameters, blood lipids and blood pressure, are systematically missing and compute the SCORE2 risk by probabilistic imputation of the missing variables based on the remaining input variables. We compare this approach to established imputation techniques like MICE by means of scoring rules and visualize in turn how probabilistic imputation can be used in sample size considerations.

Author summary: Risk calculators, and more generally computer codes, play an important part in digital health. Given patient information, they make it possible, for instance, to estimate the probability of developing certain diseases. Yet when part of the required patient information is missing, e.g., because some of the risk factors could not be measured, performing risk calculations may require imputing the missing values. We compare different imputation approaches, and essentially make a case that using probabilistic imputation approaches is worth the effort compared to deterministic approaches. In essence, propagating uncertainties on the imputed risk factors leads to probabilistic predictors of risks. Using the risk of developing cardiovascular disease in a cohort of patients from a menopause clinic in Bern, Switzerland, we illustrate how the considered probabilistic approaches outperform deterministic ones in terms of forecast evaluation scores, and how such probabilistic risk predictions may be used in medical practice, highlighting the resulting trade-offs between type I and type II errors.
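
The comparison between deterministic and probabilistic imputation can be pictured with a small Python sketch. It is not the authors' implementation and does not use SCORE2 or the Bern data: `toy_risk`, the synthetic `age` and `chol` variables, the linear-Gaussian imputation model, and the 10% risk-group threshold are all assumptions made up for illustration. The sketch scores both approaches against the fully-informed risk, using a sample-based CRPS for the predicted risk and the Brier score for the thresholded risk group.

```python
import numpy as np

def toy_risk(age, chol):
    """Hypothetical logistic risk calculator (an assumption, NOT SCORE2)."""
    return 1.0 / (1.0 + np.exp(-(-9.0 + 0.08 * age + 0.5 * chol)))

def crps_ensemble(samples, y):
    """Sample-based CRPS of an ensemble: E|X - y| - 0.5 * E|X - X'|."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

def brier(prob, outcome):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return np.mean((np.asarray(prob) - np.asarray(outcome)) ** 2)

# Synthetic cohort (assumed data): chol depends linearly on age plus noise.
rng = np.random.default_rng(0)
n = 300
age = rng.uniform(40, 65, n)
chol = 3.0 + 0.04 * age + rng.normal(0, 0.8, n)
true_risk = toy_risk(age, chol)          # fully-informed risk

# Fit the imputation model chol ~ age on the first half of the cohort.
train = np.arange(n) < n // 2
test = ~train
slope, intercept = np.polyfit(age[train], chol[train], 1)
resid_sd = np.std(chol[train] - (intercept + slope * age[train]), ddof=2)

# Deterministic imputation: plug in the conditional mean of the missing input.
mean_chol = intercept + slope * age[test]
det_risk = toy_risk(age[test], mean_chol)

# Probabilistic imputation: draw the missing input from its fitted conditional
# distribution and push every draw through the calculator.
m = 200
draws = mean_chol[:, None] + rng.normal(0, resid_sd, (test.sum(), m))
prob_risk = toy_risk(age[test][:, None], draws)   # shape (n_test, m)

# CRPS against the fully-informed risk (for a point forecast, CRPS = |error|).
crps_det = np.mean(np.abs(det_risk - true_risk[test]))
crps_prob = np.mean([crps_ensemble(s, y)
                     for s, y in zip(prob_risk, true_risk[test])])

# Brier score for membership in a hypothetical "high risk" group.
threshold = 0.10
high = (true_risk[test] > threshold).astype(float)
brier_det = brier((det_risk > threshold).astype(float), high)
brier_prob = brier((prob_risk > threshold).mean(axis=1), high)

print(f"CRPS  deterministic={crps_det:.4f}  probabilistic={crps_prob:.4f}")
print(f"Brier deterministic={brier_det:.4f}  probabilistic={brier_prob:.4f}")
```

Lower values are better for both scores. In this toy setting the deterministic approach can only output a hard 0/1 group assignment, whereas the probabilistic approach yields a probability of exceeding the threshold, which is what allows trade-offs between type I and type II errors to be tuned.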

Date: 2025

Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000712 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 00712&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0000712

DOI: 10.1371/journal.pdig.0000712

More articles in PLOS Digital Health from Public Library of Science

 
Page updated 2025-05-31
Handle: RePEc:plo:pdig00:0000712