Experiment-selector cross-validated targeted maximum likelihood estimator for hybrid RCT-external data studies
Dang Lauren Eyler,
Tarp Jens Magelund,
Abrahamsen Trine Julie,
Kvist Kajsa,
Buse John B.,
Petersen Maya and
Mark van der Laan
Additional contact information
Dang Lauren Eyler: Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Rockville, MD, 20852, United States of America
Tarp Jens Magelund: Novo Nordisk, Søborg, Denmark
Abrahamsen Trine Julie: Novo Nordisk, Søborg, Denmark
Kvist Kajsa: Novo Nordisk, Søborg, Denmark
Buse John B.: Division of Endocrinology, Department of Medicine, University of North Carolina, Chapel Hill, NC, 27516, United States of America
Petersen Maya: Department of Biostatistics, University of California, Berkeley, CA, 94720, United States of America
Mark van der Laan: Department of Biostatistics, University of California, Berkeley, CA, 94720, United States of America
Journal of Causal Inference, 2025, vol. 13, issue 1, 33
Abstract:
Augmenting a randomized controlled trial (RCT) with external data may increase power at the risk of introducing bias. To select and analyze the experiment (RCT alone or RCT combined with external data) with the optimal bias-variance tradeoff, we develop a novel experiment-selector cross-validated targeted maximum likelihood estimator for randomized-external data studies (ES-CVTMLE). To determine whether to integrate external data, this estimator uses two estimates of bias: (1) a function of the difference in conditional mean outcome under control between the RCT and combined experiments and (2) an estimate of the average treatment effect on a negative control outcome. We define the asymptotic distribution of the ES-CVTMLE under varying magnitudes of bias and construct confidence intervals by Monte Carlo simulation. We compare the ES-CVTMLE with three other data fusion estimators in simulations and demonstrate its ability to distinguish biased from unbiased external controls in a real data analysis of the effect of liraglutide on glycemic control from the LEADER trial. The ES-CVTMLE has the potential to improve power while providing relatively robust inference for future hybrid RCT-external data studies.
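The selection idea in the abstract can be illustrated with a toy sketch. This is not the ES-CVTMLE: it uses unadjusted difference-in-means estimates, no cross-validation, no targeting step, and no negative control outcome. It assumes the selection criterion is estimated squared bias plus estimated variance, and it uses the difference in raw control-arm means as a crude stand-in for bias estimate (1). All names (`Experiment`, `control_mean_bias`, `ate_variance`, `select_experiment`) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, variance
from typing import List, Sequence


@dataclass
class Experiment:
    """Hypothetical container for one candidate experiment."""
    name: str
    treated: List[float]   # outcomes in the treatment arm
    controls: List[float]  # control-arm outcomes (may include external controls)


def ate_variance(exp: Experiment) -> float:
    """Estimated variance of an unadjusted difference-in-means effect estimate."""
    return (variance(exp.treated) / len(exp.treated)
            + variance(exp.controls) / len(exp.controls))


def control_mean_bias(exp: Experiment, rct: Experiment) -> float:
    """Crude bias proxy (assumption): control-arm mean in `exp` minus that in the RCT."""
    return mean(exp.controls) - mean(rct.controls)


def select_experiment(rct: Experiment, candidates: Sequence[Experiment]) -> Experiment:
    """Pick the experiment minimizing squared bias estimate plus variance estimate.

    The RCT alone is always a candidate; its bias proxy is zero by construction.
    """
    def criterion(exp: Experiment) -> float:
        return control_mean_bias(exp, rct) ** 2 + ate_variance(exp)

    return min([rct, *candidates], key=criterion)


# Made-up illustrative data, not from the paper.
rct = Experiment("rct", [1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 2.0, 3.0])
similar = Experiment("rct+similar", rct.treated, rct.controls + [0.0, 1.0, 2.0, 3.0])
shifted = Experiment("rct+shifted", rct.treated, rct.controls + [10.0, 11.0, 12.0, 13.0])

print(select_experiment(rct, [similar]).name)
print(select_experiment(rct, [shifted]).name)
```

With these made-up numbers, external controls that resemble the RCT controls lower the variance without adding estimated bias, so the combined experiment is chosen; external controls with a shifted mean produce a large squared-bias term, so the selector falls back to the RCT alone.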
Keywords: causal inference; clinical trials; data fusion; negative control outcomes; real world data
Date: 2025
Downloads: (external link)
https://doi.org/10.1515/jci-2024-0041 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:bpj:causin:v:13:y:2025:i:1:p:33:n:1001
DOI: 10.1515/jci-2024-0041
Journal of Causal Inference is currently edited by Elias Bareinboim, Jin Tian and Iván Díaz
More articles in Journal of Causal Inference from De Gruyter