A constrained maximum likelihood approach to developing well-calibrated models for predicting binary outcomes

Yaqi Cao, Weidong Ma, Ge Zhao, Anne Marie McCarthy and Jinbo Chen
Additional contact information
Yaqi Cao: Minzu University of China
Weidong Ma: Perelman School of Medicine, University of Pennsylvania
Ge Zhao: Portland State University
Anne Marie McCarthy: Perelman School of Medicine, University of Pennsylvania
Jinbo Chen: Perelman School of Medicine, University of Pennsylvania

Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, 2024, vol. 30, issue 3, No 6, 624-648

Abstract: The added value of candidate predictors for risk modeling is routinely evaluated by comparing the performance of models with and without the candidate predictors. Such a comparison is most meaningful when the risk estimates from the two models are both unbiased in the target population. Very often, data for candidate predictors are sourced from nonrepresentative convenience samples. Updating the base model using the study data without acknowledging the discrepancy between the underlying distribution of the study data and that in the target population can lead to biased risk estimates and therefore an unfair evaluation of candidate predictors. To address this issue, assuming access to a well-calibrated base model, we propose a semiparametric method for model fitting that enforces good calibration. The central idea is to calibrate the fitted model against the base model by enforcing suitable constraints when maximizing the likelihood function. This approach enables unbiased assessment of the model improvement offered by candidate predictors without requiring a representative sample from the target population, thus overcoming a significant practical challenge. We study theoretical properties of the model parameter estimates and demonstrate improvement in model calibration via extensive simulation studies. Finally, we apply the proposed method to data extracted from the Penn Medicine Biobank to inform the added value of breast density for breast cancer risk assessment in the Caucasian woman population.

Keywords: Calibration; Constrained maximum likelihood estimation; Logistic regression; Risk prediction
Date: 2024

Downloads: (external link)
http://link.springer.com/10.1007/s10985-024-09628-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:lifeda:v:30:y:2024:i:3:d:10.1007_s10985-024-09628-9

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10985

DOI: 10.1007/s10985-024-09628-9

Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data is currently edited by Mei-Ling Ting Lee

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:lifeda:v:30:y:2024:i:3:d:10.1007_s10985-024-09628-9