
Flexible and Efficient Contextual Bandits with Heterogeneous Treatment Effect Oracles

Aldo Gael Carranza, Sanath Kumar Krishnamurthy and Susan Athey
Additional contact information
Aldo Gael Carranza: Stanford U
Sanath Kumar Krishnamurthy: Stanford U

Research Papers from Stanford University, Graduate School of Business

Abstract: Contextual bandit algorithms often estimate reward models to inform decision-making. However, true rewards can contain action-independent redundancies that are not relevant for decision-making. We show it is more data-efficient to estimate any function that explains the reward differences between actions, that is, the treatment effects. Motivated by this observation, building on recent work on oracle-based bandit algorithms, we provide the first reduction of contextual bandits to general-purpose heterogeneous treatment effect estimation, and we design a simple and computationally efficient algorithm based on this reduction. Our theoretical and experimental results demonstrate that heterogeneous treatment effect estimation in contextual bandits offers practical advantages over reward estimation, including more efficient model estimation and greater flexibility to model misspecification.
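
The abstract's central observation, that action-independent parts of the reward do not affect which action is best, can be illustrated with a minimal sketch. This is not the authors' algorithm or estimator: the functions `baseline` and `treatment_effect`, their functional forms, and the grid of contexts are all hypothetical, chosen only to show that the greedy action computed from full rewards matches the one computed from reward differences alone.

```python
# Illustrative assumption: reward(x, a) = baseline(x) + treatment_effect(x, a),
# where baseline(x) does not depend on the action a.

def baseline(x):
    # Hypothetical action-independent component (large and nonlinear in x).
    return 10.0 * x * x + 3.0

def treatment_effect(x, action):
    # Hypothetical reward difference of each action relative to action 0.
    effects = [0.0, 0.5 * x, -0.2 * x + 0.1]
    return effects[action]

def reward(x, action):
    # Full reward: shared baseline plus the action-specific effect.
    return baseline(x) + treatment_effect(x, action)

def argmax(values):
    # Index of the largest value (first index on ties).
    return max(range(len(values)), key=lambda i: values[i])

def greedy_from_rewards(x, n_actions=3):
    return argmax([reward(x, a) for a in range(n_actions)])

def greedy_from_effects(x, n_actions=3):
    return argmax([treatment_effect(x, a) for a in range(n_actions)])

# Contexts on a coarse grid in [-1, 1].
contexts = [k / 8 for k in range(-8, 9)]

# The baseline shifts every action's reward equally, so both rules agree.
agree = all(greedy_from_rewards(x) == greedy_from_effects(x) for x in contexts)
print(agree)
```

In this toy setting a learner that models `reward` must also fit the quadratic `baseline`, while a learner that models only `treatment_effect` fits functions that are linear in the context and still recovers the same decisions.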

Date: 2023-02
New Economics Papers: this item is included in nep-ecm and nep-exp

Downloads: (external link)
https://www.gsb.stanford.edu/faculty-research/work ... erogeneous-treatment



Persistent link: https://EconPapers.repec.org/RePEc:ecl:stabus:4081


 
Page updated 2025-03-30
Handle: RePEc:ecl:stabus:4081