Inference after discretizing unobserved heterogeneity
Jad Beyhum and Martin Mugnier
Additional contact information
Jad Beyhum: KU Leuven - Catholic University of Leuven = Katholieke Universiteit Leuven
Martin Mugnier: PSE - Paris School of Economics - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - ENPC - École nationale des ponts et chaussées - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - ENPC - École nationale des ponts et chaussées - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement
PSE Working Papers from HAL
Abstract:
We consider a linear panel data model with nonseparable two-way unobserved heterogeneity corresponding to a linear version of the model studied in Bonhomme et al. (2022). We show that inference is possible in this setting using a straightforward two-step estimation procedure inspired by existing discretization approaches. In the first step, we construct a discrete approximation of the unobserved heterogeneity by (k-means) clustering observations separately across the individual (i) and time (t) dimensions. In the second step, we estimate a linear model with two-way group fixed effects specific to each cluster. Our approach shares similarities with methods from the double machine learning literature, as the underlying moment conditions exhibit the same type of bias-reducing properties. We provide a theoretical analysis of a cross-fitted version of our estimator, establishing its asymptotic normality at parametric rate under the condition max(N, T) = o(min(N, T)^3). Simulation studies demonstrate that our methodology achieves excellent finite-sample performance, even when T is negligible with respect to N.
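The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: units are clustered on their time averages of the outcome and the regressor, periods on their cross-sectional averages, the "two-way group fixed effects" are read as one intercept per (unit cluster, period cluster) cell, and the cross-fitting analysed in the paper is omitted. The names `kmeans_labels`, `two_step_estimate`, and `simulate_panel` are hypothetical, as is the simulated design.

```python
import numpy as np


def kmeans_labels(features, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one integer label per row of `features`."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=n_clusters, replace=False)
    centers = features[start].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance of every row to every center.
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):  # leave empty clusters where they are
                centers[k] = features[labels == k].mean(axis=0)
    return labels


def two_step_estimate(y, x, n_unit_clusters, n_time_clusters):
    """Slope on a scalar regressor x from (N, T) arrays y and x."""
    # Step 1 (assumed features): cluster units on time averages and
    # periods on cross-sectional averages of y and x.
    unit_feat = np.column_stack([y.mean(axis=1), x.mean(axis=1)])
    time_feat = np.column_stack([y.mean(axis=0), x.mean(axis=0)])
    g = kmeans_labels(unit_feat, n_unit_clusters)
    h = kmeans_labels(time_feat, n_time_clusters)

    # Step 2 (assumed specification): one intercept per (g, h) cell.
    # Demeaning within cells and running OLS on the demeaned data is
    # equivalent to OLS with cell dummies (Frisch-Waugh-Lovell).
    cell = (g[:, None] * n_time_clusters + h[None, :]).ravel()
    n_cells = n_unit_clusters * n_time_clusters
    counts = np.maximum(np.bincount(cell, minlength=n_cells), 1)

    def within(v):
        sums = np.bincount(cell, weights=v.ravel(), minlength=n_cells)
        return v.ravel() - (sums / counts)[cell]

    y_w, x_w = within(y), within(x)
    return float(x_w @ y_w / (x_w @ x_w))


def simulate_panel(n_units=200, n_periods=50, beta=1.0, seed=0):
    """Toy design: nonseparable heterogeneity f = alpha_i * gamma_t enters x and y."""
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(0.0, 2.0, size=(n_units, 1))
    gamma = rng.uniform(0.0, 2.0, size=(1, n_periods))
    f = alpha * gamma
    x = f + rng.normal(size=(n_units, n_periods))
    y = beta * x + f + rng.normal(size=(n_units, n_periods))
    return y, x
```

Calling `two_step_estimate(*simulate_panel(), n_unit_clusters=10, n_time_clusters=5)` returns a slope estimate that can be compared with pooled OLS on the same simulated data. The numbers of clusters here are arbitrary; how they should grow with N and T is a question addressed in the paper itself, not by this sketch.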
Keywords: Unobserved heterogeneity; K-means clustering; Panel data; Double machine learning; Inference
Date: 2024-12
Note: View the original document on HAL open archive server: https://shs.hal.science/halshs-04840588v1
Downloads: (external link)
https://shs.hal.science/halshs-04840588v1/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:psewpa:halshs-04840588