Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference

Schnitzer Mireille E., Lok Judith J. and Gruber Susan
Additional contact information
Schnitzer Mireille E.: Faculté de pharmacie, Université de Montréal, Pavillon Jean-Coutu, 2940 ch de la Polytechnique, P.O. Box 6128, Station Centre-ville, Montreal, Quebec, Canada
Lok Judith J.: Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Gruber Susan: Reagan-Udall Foundation for the FDA, Washington, DC, USA

The International Journal of Biostatistics, 2016, vol. 12, issue 1, 97-115

Abstract: This paper investigates the appropriateness of the integration of flexible propensity score modeling (nonparametric or machine learning approaches) in semiparametric models for the estimation of a causal quantity, such as the mean outcome under treatment. We begin with an overview of some of the issues involved in knowledge-based and statistical variable selection in causal inference and the potential pitfalls of automated selection based on the fit of the propensity score. Using a simple example, we directly show the consequences of adjusting for pure causes of the exposure when using inverse probability of treatment weighting (IPTW). Such variables are likely to be selected when using a naive approach to model selection for the propensity score. We describe how the method of Collaborative Targeted minimum loss-based estimation (C-TMLE; van der Laan and Gruber, 2010 [27]) capitalizes on the collaborative double robustness property of semiparametric efficient estimators to select covariates for the propensity score based on the error in the conditional outcome model. Finally, we compare several approaches to automated variable selection in low- and high-dimensional settings through a simulation study. From this simulation study, we conclude that using IPTW with flexible prediction for the propensity score can result in inferior estimation, while Targeted minimum loss-based estimation and C-TMLE may benefit from flexible prediction and remain robust to the presence of variables that are highly correlated with treatment. However, in our study, standard influence function-based methods for the variance underestimated the standard errors, resulting in poor coverage under certain data-generating scenarios.
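As a rough illustration of the IPTW point in the abstract, the sketch below simulates a toy setting that is not taken from the paper: one binary confounder `W`, one binary pure cause of exposure `Z`, and an outcome that depends only on `W`. All names, coefficients, and sample sizes are assumptions chosen for illustration, and the propensity scores are treated as known rather than estimated. Under these assumptions both IPTW estimators target the same mean outcome under treatment (1.5 here), but weighting by the score that also conditions on `Z` produces extreme weights and a visibly larger sampling variance.

```python
import random
import statistics


def iptw_once(rng, n):
    """One simulated dataset; returns two IPTW estimates of E[Y^1].

    Hypothetical data-generating process (not from the paper):
      W ~ Bernoulli(0.5)  confounder (affects treatment and outcome)
      Z ~ Bernoulli(0.5)  pure cause of exposure (affects treatment only)
      P(A=1 | W, Z) = 0.05 + 0.10*W + 0.80*Z
      Y^1 = 1 + W + N(0, 1), so the true E[Y^1] is 1.5
    """
    total_w = 0.0   # weights use P(A=1 | W) only
    total_wz = 0.0  # weights use P(A=1 | W, Z)
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        z = 1 if rng.random() < 0.5 else 0
        g_wz = 0.05 + 0.10 * w + 0.80 * z  # true propensity given (W, Z)
        g_w = 0.45 + 0.10 * w              # g_wz averaged over Z
        treated = rng.random() < g_wz
        if treated:
            y = 1.0 + w + rng.gauss(0.0, 1.0)
            total_w += y / g_w
            total_wz += y / g_wz
    # Horvitz-Thompson form: (1/n) * sum of A * Y / g
    return total_w / n, total_wz / n


def run_simulation(seed=0, reps=400, n=500):
    """Repeat the experiment and summarize both estimators."""
    rng = random.Random(seed)
    est_w, est_wz = [], []
    for _ in range(reps):
        e_w, e_wz = iptw_once(rng, n)
        est_w.append(e_w)
        est_wz.append(e_wz)
    return {
        "mean_w": statistics.mean(est_w),
        "var_w": statistics.variance(est_w),
        "mean_wz": statistics.mean(est_wz),
        "var_wz": statistics.variance(est_wz),
    }


if __name__ == "__main__":
    for name, value in run_simulation().items():
        print(f"{name}: {value:.4f}")
```

The known-propensity setup keeps the sketch short; a fitted propensity model, as studied in the paper, would behave differently in detail, so this only conveys the direction of the effect.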

Keywords: C-TMLE; IPTW; variable reduction
Date: 2016

Downloads: (external link)
https://doi.org/10.1515/ijb-2015-0017 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:ijbist:v:12:y:2016:i:1:p:97-115:n:9

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/ijb/html

DOI: 10.1515/ijb-2015-0017

The International Journal of Biostatistics is currently edited by Antoine Chambaz, Alan E. Hubbard and Mark J. van der Laan

More articles in The International Journal of Biostatistics from De Gruyter
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:ijbist:v:12:y:2016:i:1:p:97-115:n:9