Valid Post-Contextual Bandit Inference

Ramon van den Akker, Bas J. M. Werker and Bo Zhou

Papers from arXiv.org

Abstract: We establish an asymptotic framework for the statistical analysis of the stochastic contextual multi-armed bandit problem (CMAB), which is widely employed in adaptively randomized experiments across various fields. While algorithms for maximizing rewards or, equivalently, minimizing regret have received considerable attention, we focus on statistical inference with adaptively collected data under the CMAB model. To this end, we derive the limit experiment (in the Hájek-Le Cam sense). This limit experiment is highly nonstandard and, applying Girsanov's theorem, we obtain a structural representation of it in terms of stochastic differential equations. This structural representation, together with a general weak convergence result we develop, allows us to obtain the asymptotic distribution of statistics for the CMAB problem. In particular, we obtain the asymptotic distributions of the classical t-test (non-Gaussian), Adaptively Weighted tests, and Inverse Propensity Weighted tests (non-Gaussian). We show that, when comparing the two arms, validity of these tests requires the sampling scheme to be translation invariant in a way we make precise. We propose translation-invariant versions of Thompson, tempered greedy, and tempered Upper Confidence Bound sampling. Simulation results corroborate our asymptotic analysis.
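
For readers unfamiliar with the setting, the following is a minimal sketch of adaptively collected data and an inverse-propensity-weighted (IPW) estimate of the difference between two arm means. It is not the paper's procedure: the context-free two-armed Gaussian bandit, the Thompson-style rule, the clipping level `clip`, and the names `run_bandit` and `normal_cdf` are all assumptions made for illustration. The sampling rule here happens to depend on the rewards only through the difference of the arm sample means; whether that matches the paper's notion of translation invariance cannot be determined from the abstract.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def run_bandit(mu0: float, mu1: float, horizon: int,
               rng: random.Random, clip: float = 0.05) -> float:
    """Simulate a two-armed Gaussian bandit (unit variance, no context)
    and return the IPW estimate of mu1 - mu0.

    Illustrative assumptions: a Thompson-style assignment probability
    based on the difference of sample means, clipped to [clip, 1 - clip].
    """
    counts = [0, 0]       # number of pulls per arm
    sums = [0.0, 0.0]     # running reward sums per arm
    ipw = [0.0, 0.0]      # running sums of reward / propensity per arm

    for _ in range(horizon):
        if min(counts) == 0:
            # Randomize uniformly until both arms have been observed.
            p1 = 0.5
        else:
            diff = sums[1] / counts[1] - sums[0] / counts[0]
            scale = math.sqrt(1.0 / counts[0] + 1.0 / counts[1])
            # Probability that arm 1 looks better, kept away from 0 and 1.
            p1 = min(max(normal_cdf(diff / scale), clip), 1.0 - clip)

        arm = 1 if rng.random() < p1 else 0
        reward = rng.gauss(mu1 if arm == 1 else mu0, 1.0)

        counts[arm] += 1
        sums[arm] += reward
        # Weight each reward by the inverse of its assignment probability.
        ipw[arm] += reward / (p1 if arm == 1 else 1.0 - p1)

    return (ipw[1] - ipw[0]) / horizon


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [run_bandit(0.0, 0.5, 200, rng) for _ in range(500)]
    print(sum(estimates) / len(estimates))
```

Because each reward is divided by the known probability with which its arm was chosen, the IPW estimate is unbiased for the difference in means even though the assignment probabilities change with the data. Its sampling distribution can still be far from normal under adaptive sampling, which is the kind of behavior the abstract describes as non-Gaussian.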

Date: 2025-05

Download: http://arxiv.org/pdf/2505.13897 (latest version, application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2505.13897

Page updated 2025-06-14
Handle: RePEc:arx:papers:2505.13897