A cross‐validation statistical framework for asymmetric data integration

Lam Tran, Kevin He, Di Wang and Hui Jiang

Biometrics, 2023, vol. 79, issue 2, 1280-1292

Abstract: The proliferation of biobanks and large public clinical data sets enables their integration with a smaller amount of locally gathered data for the purposes of parameter estimation and model prediction. However, public data sets may be subject to context‐dependent confounders and the protocols behind their generation are often opaque; naively integrating all external data sets equally can bias estimates and lead to spurious conclusions. Weighted data integration is a potential solution, but current methods still require subjective specifications of weights and can become computationally intractable. Under the assumption that local data are generated from the set of unknown true parameters, we propose a novel weighted integration method based upon using the external data to minimize the local data leave‐one‐out cross‐validation (LOOCV) error. We demonstrate how the optimization of LOOCV errors for linear and Cox proportional hazards models can be rewritten as functions of external data set integration weights. Significant reductions in estimation error and prediction error are shown using simulation studies mimicking the heterogeneity of clinical data as well as a real‐world example using kidney transplant patients from the Scientific Registry of Transplant Recipients.
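
Illustration: the idea of choosing external data set weights by minimizing the local LOOCV error can be sketched for a linear model roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a single covariate with no intercept, weights restricted to [0, 1], and a plain grid search over the weights. The function names (`fit_slope`, `loocv_error`, `select_weights`, `simulate`) and the simulated data are hypothetical, and the Cox model case is not covered.

```python
import itertools
import random


def fit_slope(local, externals, weights, skip=None):
    """Weighted least-squares slope for y = beta * x (no intercept).

    Local points have weight 1; each external data set k has weight weights[k].
    If skip is given, the local point with that index is left out.
    Returns (beta, weighted sum of x^2).
    """
    sxx = sxy = 0.0
    for i, (x, y) in enumerate(local):
        if i == skip:
            continue
        sxx += x * x
        sxy += x * y
    for w, data in zip(weights, externals):
        sxx += w * sum(x * x for x, _ in data)
        sxy += w * sum(x * y for x, y in data)
    return sxy / sxx, sxx


def loocv_error(local, externals, weights):
    """Mean squared LOOCV error on the local data, as a function of the weights.

    Uses the standard leverage shortcut for least squares, so no refitting
    is needed: loo_residual_i = residual_i / (1 - leverage_i).
    """
    beta, sxx = fit_slope(local, externals, weights)
    total = 0.0
    for x, y in local:
        leverage = x * x / sxx
        total += ((y - beta * x) / (1.0 - leverage)) ** 2
    return total / len(local)


def loocv_error_bruteforce(local, externals, weights):
    """Same quantity, computed by refitting once per held-out local point."""
    total = 0.0
    for i, (x, y) in enumerate(local):
        beta_i, _ = fit_slope(local, externals, weights, skip=i)
        total += (y - beta_i * x) ** 2
    return total / len(local)


def select_weights(local, externals, grid):
    """Grid search (an assumption of this sketch) for the weights minimizing LOOCV error."""
    candidates = itertools.product(grid, repeat=len(externals))
    return min(candidates, key=lambda w: loocv_error(local, externals, w))


def simulate(n, slope, rng):
    """Hypothetical toy data: y = slope * x + standard normal noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        data.append((x, slope * x + rng.gauss(0.0, 1.0)))
    return data


rng = random.Random(0)
local = simulate(15, 1.0, rng)               # small local sample, true slope 1
externals = [simulate(200, 1.0, rng),        # external set consistent with local data
             simulate(200, 3.0, rng)]        # external set with a different slope
grid = [k / 10 for k in range(11)]           # candidate weights 0.0, 0.1, ..., 1.0
weights = select_weights(local, externals, grid)
print("selected weights:", weights)
print("LOOCV error:", loocv_error(local, externals, weights))
```

Because the grid contains both the all-zero and all-one weight vectors, the selected weights can do no worse on the local LOOCV criterion than ignoring the external data or pooling everything equally. A grid search scales poorly with the number of external data sets; it is used here only to keep the sketch short.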

Date: 2023

Downloads: (external link)
https://doi.org/10.1111/biom.13685

Persistent link: https://EconPapers.repec.org/RePEc:bla:biomet:v:79:y:2023:i:2:p:1280-1292

More articles in Biometrics from The International Biometric Society

Handle: RePEc:bla:biomet:v:79:y:2023:i:2:p:1280-1292