EconPapers    
Economics at your fingertips  
 

Best-of-Both-Worlds Linear Contextual Bandits

Masahiro Kato and Shinji Ito

Papers from arXiv.org

Abstract: This study investigates the problem of $K$-armed linear contextual bandits, an instance of the multi-armed bandit problem, under an adversarial corruption. At each round, a decision-maker observes an independent and identically distributed context and then selects an arm based on the context and past observations. After selecting an arm, the decision-maker incurs a loss corresponding to the selected arm. The decision-maker aims to minimize the cumulative loss over the trial. The goal of this study is to develop a strategy that is effective in both stochastic and adversarial environments, with theoretical guarantees. We first formulate the problem by introducing a novel setting of bandits with adversarial corruption, referred to as the contextual adversarial regime with a self-bounding constraint. We assume linear models for the relationship between the loss and the context. Then, we propose a strategy that extends the RealLinExp3 by Neu & Olkhovskaya (2020) and the Follow-The-Regularized-Leader (FTRL). The regret of our proposed algorithm is shown to be upper-bounded by $O\left(\min\left\{\frac{(\log(T))^3}{\Delta_{*}} + \sqrt{\frac{C(\log(T))^3}{\Delta_{*}}},\ \ \sqrt{T}(\log(T))^2\right\}\right)$, where $T \in\mathbb{N}$ is the number of rounds, $\Delta_{*} > 0$ is the constant minimum gap between the best and suboptimal arms for any context, and $C\in[0, T] $ is an adversarial corruption parameter. This regret upper bound implies $O\left(\frac{(\log(T))^3}{\Delta_{*}}\right)$ in a stochastic environment and by $O\left( \sqrt{T}(\log(T))^2\right)$ in an adversarial environment. We refer to our strategy as the Best-of-Both-Worlds (BoBW) RealFTRL, due to its theoretical guarantees in both stochastic and adversarial regimes.

Date: 2023-12
New Economics Papers: this item is included in nep-inv
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://arxiv.org/pdf/2312.16489 Latest version (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2312.16489

Access Statistics for this paper

More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().

 
Page updated 2025-03-19
Handle: RePEc:arx:papers:2312.16489