Adapting to Misspecification in Contextual Bandits with Offline Regression Oracles

Sanath Kumar Krishnamurthy, Vitor Hadad and Susan Athey
Additional contact information
Sanath Kumar Krishnamurthy: Stanford University
Vitor Hadad: Stanford University

Research Papers from Stanford University, Graduate School of Business

Abstract: Computationally efficient contextual bandits are often based on estimating a predictive model of rewards given contexts and arms using past data. However, when the reward model is not well-specified, the bandit algorithm may incur unexpected regret, so recent work has focused on algorithms that are robust to misspecification. We propose a simple family of contextual bandit algorithms that adapt to misspecification error by reverting to a good safe policy when there is evidence that misspecification is causing a regret increase. Our algorithm requires only an offline regression oracle to ensure regret guarantees that gracefully degrade in terms of a measure of the average misspecification level. Compared to prior work, we attain similar regret guarantees, but we do not rely on a master algorithm, and do not require more robust oracles like online or constrained regression oracles [e.g., Foster et al. (2020a); Krishnamurthy et al. (2020)]. This allows us to design algorithms for more general function approximation classes.
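
The reverting idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or its test statistic. It is a minimal, assumed example of one way to detect "evidence that misspecification is causing a regret increase": compare the empirical reward of a model-based policy against that of a safe policy, and revert only when the gap exceeds a Hoeffding-style margin. The function name `should_revert_to_safe`, the parameter `delta`, and the assumption that rewards lie in [0, 1] are all hypothetical choices made for illustration.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def should_revert_to_safe(model_policy_rewards, safe_policy_rewards, delta=0.05):
    """Illustrative check (an assumption, not the paper's test).

    Returns True when the safe policy's empirical mean reward exceeds the
    model-based policy's empirical mean reward by more than the sum of two
    Hoeffding confidence radii. Rewards are assumed to lie in [0, 1].
    """
    n_model = len(model_policy_rewards)
    n_safe = len(safe_policy_rewards)
    if n_model == 0 or n_safe == 0:
        # No evidence either way, so keep the current policy.
        return False

    def radius(n):
        # Two-sided Hoeffding radius at level delta / 2 per mean,
        # so both bounds hold together with probability >= 1 - delta.
        return math.sqrt(math.log(4.0 / delta) / (2.0 * n))

    gap = mean(safe_policy_rewards) - mean(model_policy_rewards)
    return gap > radius(n_model) + radius(n_safe)


# Example: a clearly worse model-based policy triggers the fallback,
# while a small gap within the confidence margin does not.
clearly_worse = should_revert_to_safe([0.1] * 200, [0.9] * 200)
roughly_equal = should_revert_to_safe([0.5] * 200, [0.55] * 200)
```

The margin shrinks as more rewards are observed, so a persistent gap is eventually flagged while noise in small samples is not. The guarantees stated in the abstract depend on the paper's own construction, which this sketch does not reproduce.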

Date: 2021-02
New Economics Papers: this item is included in nep-ecm

Downloads: (external link)
https://www.gsb.stanford.edu/faculty-research/work ... s-offline-regression

Persistent link: https://EconPapers.repec.org/RePEc:ecl:stabus:3951

Handle: RePEc:ecl:stabus:3951