Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback
Riccardo Della Vecchia and
Debabrota Basu
Additional contact information
Riccardo Della Vecchia: Scool - Centre Inria de l'Université de Lille - Inria - Institut National de Recherche en Informatique et en Automatique - CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 - Centrale Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique
Debabrota Basu: Scool - Centre Inria de l'Université de Lille - Inria - Institut National de Recherche en Informatique et en Automatique - CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 - Centrale Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 - Centrale Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique, Centre Inria de l'Université de Lille - Inria - Institut National de Recherche en Informatique et en Automatique, Université de Lille, Centrale Lille
Post-Print from HAL
Abstract:
The independence of noise and covariates is a standard assumption in the literature on online linear regression with unbounded noise and on linear bandits. This assumption, and the analysis that relies on it, is invalid under endogeneity, i.e., when the noise and covariates are correlated. In this paper, we study the online setting of Instrumental Variable (IV) regression, which is widely used in economics to identify the underlying model from an endogenous dataset. Specifically, we upper bound the identification and oracle regrets of the popular Two-Stage Least Squares (2SLS) approach to IV regression in the online setting. Our analysis shows that Online 2SLS (O2SLS) achieves $\mathcal O(d^2\log^2 T)$ identification regret and $\mathcal O(\gamma \sqrt{d T \log T})$ oracle regret after $T$ interactions, where $d$ is the dimension of the covariates and $\gamma$ is the bias due to endogeneity. Then, we leverage O2SLS as an oracle to design OFUL-IV, a linear bandit algorithm. OFUL-IV can tackle endogeneity and achieves $\mathcal O(d\sqrt{T}\log T)$ regret. On several datasets with endogeneity, we experimentally demonstrate the efficiency of O2SLS and OFUL-IV.
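To make the idea of running 2SLS online concrete, here is a minimal sketch in Python. It is not the authors' implementation of O2SLS: it is the textbook 2SLS estimator recomputed from running sums, and the class name `OnlineTwoStageLS`, the `ridge` parameter, the helper `simulate`, and the synthetic data model (first stage $x = \Theta^\top z + \epsilon$, second stage $y = \beta^\top x + \eta$ with $\eta$ correlated with $\epsilon$) are all assumptions made for illustration.

```python
import numpy as np


class OnlineTwoStageLS:
    """Streaming 2SLS from running second-moment sums (illustrative only)."""

    def __init__(self, d_z, d_x, ridge=1e-6):
        # A small ridge term (an assumption) keeps S_zz invertible early on.
        self.S_zz = ridge * np.eye(d_z)    # sum of z z^T
        self.S_zx = np.zeros((d_z, d_x))   # sum of z x^T
        self.S_zy = np.zeros(d_z)          # sum of y * z

    def update(self, z, x, y):
        """Absorb one observation (instrument z, covariate x, outcome y)."""
        self.S_zz += np.outer(z, z)
        self.S_zx += np.outer(z, x)
        self.S_zy += y * z

    def estimate(self):
        """Return the current 2SLS estimate of beta."""
        # First stage: regress covariates on instruments.
        theta_hat = np.linalg.solve(self.S_zz, self.S_zx)
        # Second stage: regress y on the predicted covariates Z @ theta_hat.
        gram = self.S_zx.T @ theta_hat     # equals X_hat^T X_hat up to the ridge
        rhs = theta_hat.T @ self.S_zy      # equals X_hat^T y
        return np.linalg.lstsq(gram, rhs, rcond=None)[0]


def simulate(T=5000, seed=0):
    """Hypothetical endogenous data stream; returns (beta, 2SLS, OLS)."""
    rng = np.random.default_rng(seed)
    d = 2
    beta = np.array([1.0, -2.0])   # true second-stage parameter
    theta = np.eye(d)              # true first-stage parameter
    c = np.array([1.0, -1.0])      # couples the two noises (endogeneity)
    model = OnlineTwoStageLS(d, d)
    X = np.empty((T, d))
    Y = np.empty(T)
    for t in range(T):
        z = rng.normal(size=d)                   # instrument
        eps = rng.normal(size=d)                 # first-stage noise
        eta = c @ eps + 0.1 * rng.normal()       # noise correlated with x
        x = theta.T @ z + eps
        y = beta @ x + eta
        model.update(z, x, y)
        X[t], Y[t] = x, y
    beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]  # ignores endogeneity
    return beta, model.estimate(), beta_ols


if __name__ == "__main__":
    beta, beta_2sls, beta_ols = simulate()
    print("2SLS error:", np.linalg.norm(beta_2sls - beta))
    print("OLS error: ", np.linalg.norm(beta_ols - beta))
```

In this synthetic setup, ordinary least squares stays biased no matter how much data arrives because the noise is correlated with the covariates, whereas the instrument-based estimate should approach the true parameter. Keeping only the three running sums means each update costs a few outer products and no past data needs to be stored.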
Keywords: Causality; Instrumental Variables; Online linear regression; Online learning; Bandit / imperfect feedback; Linear bandits; Regret Bounds; Econometrics; Two-stage regression
Date: 2025-02
Note: View the original document on HAL open archive server: https://hal.science/hal-03831210v2
Published in AAAI Conference on Artificial Intelligence, Feb 2025, Philadelphia, United States
Downloads: (external link)
https://hal.science/hal-03831210v2/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-03831210