A two sample size estimator for large data sets
Martin O'Connell,
Howard Smith and
Oyvind Thomassen
No 17941, CEPR Discussion Papers from C.E.P.R. Discussion Papers
Abstract:
In GMM estimators, moment conditions with additive error terms involve an observed component and a predicted component. If the predicted component is computationally costly to evaluate, it may not be feasible to estimate the model with all the available data. We propose an estimator that uses the full data set for the computationally cheap observed component, but a reduced sample size for the predicted component. We establish consistency and asymptotic normality, and derive standard errors and a practical criterion for when our estimator is variance-reducing. We demonstrate the estimator's properties on a range of models through Monte Carlo studies and an empirical application to alcohol demand.
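The idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes a moment condition of the form E[z (y - f(x, theta))] = 0, a linear f standing in for the costly predicted component, instruments equal to the regressors, and exact identification, so the moment equation can be solved in closed form. All names (two_sample_size_moments, linear_predict, solve_linear_case, sub_idx) are hypothetical and the data are simulated.

```python
import numpy as np

def linear_predict(x, theta):
    # Stand-in for a predicted component that would be costly in practice.
    return x @ theta

def two_sample_size_moments(theta, y, x, z, predict, sub_idx):
    """Full-sample mean of z*y minus subsample mean of z*predict(x, theta)."""
    observed = z.T @ y / len(y)  # cheap part: all N observations
    zs, xs = z[sub_idx], x[sub_idx]
    predicted = zs.T @ predict(xs, theta) / len(sub_idx)  # costly part: n observations
    return observed - predicted

def solve_linear_case(y, x, z, sub_idx):
    """Exactly identified linear case: set the moments to zero and solve for theta."""
    observed = z.T @ y / len(y)
    a = z[sub_idx].T @ x[sub_idx] / len(sub_idx)
    return np.linalg.solve(a, observed)

# Simulated data (assumption: simple linear model with standard normal noise).
rng = np.random.default_rng(0)
N, n = 200_000, 5_000  # full sample size and reduced sample size
x = np.column_stack([np.ones(N), rng.normal(size=N)])
theta_true = np.array([1.0, -0.5])
y = x @ theta_true + rng.normal(size=N)

# Random subsample used only for the predicted component.
sub_idx = rng.choice(N, size=n, replace=False)

theta_hat = solve_linear_case(y, x, x, sub_idx)
moments_at_hat = two_sample_size_moments(theta_hat, y, x, x, linear_predict, sub_idx)
print(theta_hat, moments_at_hat)
```

In a realistic application the predicted component would be nonlinear and the moments would be minimized numerically. Whether mixing sample sizes in this way reduces variance relative to using the subsample alone depends on the model, which is what the criterion mentioned in the abstract addresses.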
Keywords: GMM; Estimation; Micro data
JEL-codes: C20 C51 C55
Date: 2023-02
Downloads: (external link)
https://cepr.org/publications/DP17941 (application/pdf)
CEPR Discussion Papers are free to download for our researchers, subscribers and members. If you fall into one of these categories but have trouble downloading our papers, please contact us at subscribers@cepr.org
Related works:
Working Paper: A two sample size estimator for large data sets (2023) 
Persistent link: https://EconPapers.repec.org/RePEc:cpr:ceprdp:17941
Ordering information: This working paper can be ordered from
https://cepr.org/publications/DP17941
More papers in CEPR Discussion Papers from C.E.P.R. Discussion Papers Centre for Economic Policy Research, 33 Great Sutton Street, London EC1V 0DX.