Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data

Golden, Richard M.; Henley, Steven S.; White, Halbert; Kashner, T. Michael

Additional contact information
Richard M. Golden: School of Behavioral and Brain Sciences, GR4.1, 800 W. Campbell Rd., University of Texas at Dallas, Richardson, TX 75080, USA
Steven S. Henley: Martingale Research Corporation, 101 E. Park Blvd., Suite 600, Plano, TX 75074, USA
T. Michael Kashner: Department of Medicine, Loma Linda University School of Medicine, Loma Linda, CA 92357, USA

Econometrics, 2019, vol. 7, issue 3, 1-27

Abstract: Researchers are often faced with the challenge of developing statistical models with incomplete data. Exacerbating this situation is the possibility that either the researcher’s complete-data model or the model of the missing-data mechanism is misspecified. In this article, we create a formal theoretical framework for developing statistical models and detecting model misspecification in the presence of incomplete data where maximum likelihood estimates are obtained by maximizing the observable-data likelihood function when the missing-data mechanism is assumed ignorable. First, we provide sufficient regularity conditions on the researcher’s complete-data model to characterize the asymptotic behavior of maximum likelihood estimates in the simultaneous presence of both missing data and model misspecification. These results are then used to derive robust hypothesis testing methods for possibly misspecified models in the presence of Missing at Random (MAR) or Missing Not at Random (MNAR) missing data. Second, we introduce a method for the detection of model misspecification in missing data problems using recently developed Generalized Information Matrix Tests (GIMT). Third, we identify regularity conditions for the Missing Information Principle (MIP) to hold in the presence of model misspecification so as to provide useful computational covariance matrix estimation formulas. Fourth, we provide regularity conditions that ensure the observable-data expected negative log-likelihood function is convex in the presence of partially observable data when the amount of missingness is sufficiently small and the complete-data likelihood is convex. Fifth, we show that when the researcher has correctly specified a complete-data model with a convex negative likelihood function and an ignorable missing-data mechanism, then its strict local minimizer is the true parameter value for the complete-data model when the amount of missingness is sufficiently small. Our results thus provide new robust estimation, inference, and specification analysis methods for developing statistical models with incomplete data.

Keywords: asymptotic theory; ignorable; Generalized Information Matrix Test; misspecification; missing data; nonignorable; sandwich estimator; specification analysis
JEL-codes: B23 C C00 C01 C1 C2 C3 C4 C5 C8
Date: 2019

Downloads:
https://www.mdpi.com/2225-1146/7/3/37/pdf (application/pdf)
https://www.mdpi.com/2225-1146/7/3/37/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jecnmx:v:7:y:2019:i:3:p:37-:d:264548
