Online Instrumental Variable Regression: Regret Analysis and Bandit Feedback
Riccardo Della Vecchia and Debabrota Basu
Additional contact information
Riccardo Della Vecchia: Scool - Scool - Centre Inria de l'Université de Lille - Inria - Institut National de Recherche en Informatique et en Automatique - CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 - Centrale Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique
Debabrota Basu: Scool - Scool - Centre Inria de l'Université de Lille - Inria - Institut National de Recherche en Informatique et en Automatique - CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 - Centrale Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique
Working Papers from HAL
Abstract:
Independence of noise and covariates is a standard assumption in the literature on online linear regression with unbounded noise and on linear bandits. This assumption, and the analysis built on it, breaks down under endogeneity, i.e., when the noise and covariates are correlated. In this paper, we study the online setting of Instrumental Variable (IV) regression, which is widely used in economics to identify the underlying model from an endogenous dataset. Specifically, we upper bound the identification and oracle regrets of the popular Two-Stage Least Squares (2SLS) approach to IV regression when it is applied in the online setting. Our analysis shows that Online 2SLS (O2SLS) achieves $\mathcal{O}(d^2\log^2 T)$ identification regret and $\mathcal{O}(\gamma \sqrt{d T \log T})$ oracle regret after $T$ interactions, where $d$ is the dimension of the covariates and $\gamma$ is the bias due to endogeneity. We then use O2SLS as an oracle to design OFUL-IV, a linear bandit algorithm that can handle endogeneity and achieves $\mathcal{O}(d\sqrt{T}\log T)$ regret. On datasets with endogeneity, we show experimentally that OFUL-IV is efficient in terms of both estimation error and regret.
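To make the two-stage idea in the abstract concrete, the following is a minimal sketch of textbook 2SLS recomputed from running sums as samples arrive one at a time. It is not taken from the paper and is not claimed to be the authors' exact O2SLS update. The class name `RunningTwoStageLeastSquares`, the helper `run_demo`, the ridge term `reg`, and the synthetic data model (instruments `z`, covariates `x = z @ theta + u`, noise that shares `u` with the covariates) are all assumptions chosen for illustration.

```python
import numpy as np


class RunningTwoStageLeastSquares:
    """Textbook 2SLS from running sums (illustrative, not the paper's O2SLS)."""

    def __init__(self, dim_z, dim_x, reg=1e-6):
        # Small ridge term (an assumption) keeps Z^T Z invertible early on.
        self.zz = reg * np.eye(dim_z)          # running sum of z z^T
        self.zx = np.zeros((dim_z, dim_x))     # running sum of z x^T
        self.zy = np.zeros(dim_z)              # running sum of z * y

    def update(self, z, x, y):
        self.zz += np.outer(z, z)
        self.zx += np.outer(z, x)
        self.zy += y * z

    def estimate(self):
        # Stage 1: regress covariates on instruments.
        theta = np.linalg.solve(self.zz, self.zx)
        # Stage 2: regress y on the predicted covariates Z @ theta,
        # written in terms of the running sums only.
        gram = theta.T @ self.zz @ theta
        return np.linalg.solve(gram, theta.T @ self.zy)


def run_demo(n=5000, seed=0):
    """Return (2SLS error, OLS error) on synthetic endogenous data."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -2.0])                  # hypothetical true parameter
    theta = np.array([[1.0, 0.5], [0.0, 1.0]])    # hypothetical first stage
    z = rng.normal(size=(n, 2))                   # instruments
    u = rng.normal(size=(n, 2))                   # first-stage noise
    x = z @ theta + u                             # endogenous covariates
    eta = u.sum(axis=1) + 0.1 * rng.normal(size=n)  # noise correlated with x
    y = x @ beta + eta

    model = RunningTwoStageLeastSquares(dim_z=2, dim_x=2)
    for t in range(n):
        model.update(z[t], x[t], y[t])

    beta_iv = model.estimate()
    beta_ols = np.linalg.lstsq(x, y, rcond=None)[0]  # ignores endogeneity
    return np.linalg.norm(beta_iv - beta), np.linalg.norm(beta_ols - beta)


if __name__ == "__main__":
    err_iv, err_ols = run_demo()
    print(f"2SLS error: {err_iv:.3f}, OLS error: {err_ols:.3f}")
```

In this synthetic setup the instruments are independent of the noise, so the two-stage estimate should approach the true parameter, while ordinary least squares stays biased because the noise and covariates share the component `u`.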
Keywords: Causality; Instrumental Variables; Online linear regression; Online learning; Bandit / imperfect feedback; Linear bandits; Regret Bounds; Econometrics; Two-stage regression
Date: 2023-02-20
New Economics Papers: this item is included in nep-ecm
Note: View the original document on HAL open archive server: https://hal.science/hal-03831210v2
Downloads:
https://hal.science/hal-03831210v2/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:wpaper:hal-03831210