Second-Order Inference for the Mean of a Variable Missing at Random
Iván Díaz,
Marco Carone and
Mark J. van der Laan
Additional contact information
Iván Díaz: Google Inc., New York, NY, USA
Marco Carone: Department of Biostatistics, University of Washington, Seattle, WA, USA
Mark J. van der Laan: Division of Biostatistics, University of California, Berkeley, CA, USA
The International Journal of Biostatistics, 2016, vol. 12, issue 1, 333-349
Abstract:
We present a second-order estimator of the mean of a variable subject to missingness, under the missing at random assumption. The estimator improves upon existing methods by using an approximate second-order expansion of the parameter functional, in addition to the first-order expansion employed by standard doubly robust methods. This results in weaker assumptions about the convergence rates necessary to establish consistency, local efficiency, and asymptotic linearity. The general estimation strategy is developed under the targeted minimum loss-based estimation (TMLE) framework. We present a simulation comparing the sensitivity of the first- and second-order estimators to the convergence rate of the initial estimators of the outcome regression and missingness score. In our simulation, the coverage probability of the second-order TMLE was always at least as close to the nominal value of 0.95 as that of its first-order counterpart. In the best-case scenario, the proposed second-order TMLE had a coverage probability of 0.86 when the first-order TMLE had a coverage probability of zero. We also present a novel first-order estimator inspired by a second-order expansion of the parameter functional. This estimator only requires one-dimensional smoothing, whereas implementation of the second-order TMLE generally requires kernel smoothing on the covariate space. The proposed first-order estimator is expected to have improved finite-sample performance compared to existing first-order estimators. In the best-case scenario of our simulation study, the novel first-order TMLE improved the coverage probability from 0 to 0.90. We provide an illustration of our methods using a publicly available dataset to determine the effect of an anticoagulant on health outcomes of patients undergoing percutaneous coronary intervention. We provide R code implementing the proposed estimator.
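For readers unfamiliar with the "standard doubly robust methods" the abstract refers to, the following is a minimal sketch of a first-order doubly robust (augmented inverse-probability-weighted) estimator of a mean under missing at random. It is not the authors' second-order TMLE, nor their R code. The function names (`aipw_mean`, `fit_stratum_nuisances`, `simulate`), the single binary covariate, and the simulated data-generating process are all assumptions made purely for illustration.

```python
import random


def aipw_mean(w, a, y, q_hat, g_hat):
    """First-order doubly robust (AIPW) estimate of E[Y] under MAR.

    w: covariates; a: 1 if the outcome is observed, else 0;
    y: outcome (None when a == 0);
    q_hat(w): estimated outcome regression E[Y | A=1, W=w];
    g_hat(w): estimated missingness score P(A=1 | W=w).
    """
    total = 0.0
    for wi, ai, yi in zip(w, a, y):
        q = q_hat(wi)
        # The inverse-probability-weighted residual corrects the plug-in term;
        # it is only evaluated for units whose outcome is observed.
        correction = ai / g_hat(wi) * (yi - q) if ai else 0.0
        total += q + correction
    return total / len(w)


def fit_stratum_nuisances(w, a, y):
    """Estimate both nuisance functions by stratum averages (discrete W only).

    Assumes every stratum contains at least one observed outcome.
    """
    q, g = {}, {}
    for level in set(w):
        idx = [i for i in range(len(w)) if w[i] == level]
        obs = [i for i in idx if a[i] == 1]
        g[level] = len(obs) / len(idx)
        q[level] = sum(y[i] for i in obs) / len(obs)
    return (lambda wi: q[wi]), (lambda wi: g[wi])


def simulate(n, seed=0):
    """Hypothetical MAR data: missingness depends only on the binary W."""
    rng = random.Random(seed)
    w, a, y = [], [], []
    for _ in range(n):
        wi = 1 if rng.random() < 0.5 else 0
        p_obs = 0.8 if wi else 0.4          # P(A=1 | W)
        ai = 1 if rng.random() < p_obs else 0
        yi = rng.gauss(1.0 + 2.0 * wi, 1.0)  # true E[Y] = 2.0
        w.append(wi)
        a.append(ai)
        y.append(yi if ai else None)
    return w, a, y


w, a, y = simulate(20000)
q_hat, g_hat = fit_stratum_nuisances(w, a, y)
estimate = aipw_mean(w, a, y, q_hat, g_hat)

# Naive comparison: mean of the observed outcomes only.
observed = [yi for yi, ai in zip(y, a) if ai]
complete_case = sum(observed) / len(observed)
print(estimate, complete_case)
```

In this toy setting the complete-case mean is biased upward because outcomes are observed more often when W = 1, while the doubly robust estimate stays close to the true mean of 2.0. With a continuous covariate the nuisance functions would have to be estimated by smoothing or regression, which is where the convergence-rate conditions discussed in the abstract come into play.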
Keywords: asymptotic linearity; targeted maximum likelihood estimation (TMLE); higher-order influence curve; efficient influence curve; missing at random
Date: 2016
Downloads:
https://doi.org/10.1515/ijb-2015-0031 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:ijbist:v:12:y:2016:i:1:p:333-349:n:14
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/ijb/html
DOI: 10.1515/ijb-2015-0031
The International Journal of Biostatistics is currently edited by Antoine Chambaz, Alan E. Hubbard and Mark J. van der Laan
Bibliographic data for series maintained by Peter Golla.