Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap

Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masahiro Nomura and Chao Qin

Papers from arXiv.org

Abstract: We consider fixed-budget best-arm identification in two-armed Gaussian bandit problems. A longstanding open question is whether there exists an optimal strategy under which the probability of misidentification matches a lower bound. We show that a strategy following the Neyman allocation rule (Neyman, 1934) is asymptotically optimal when the gap between the expected rewards is small. First, we review a lower bound derived by Kaufmann et al. (2016). Then, we propose the "Neyman Allocation (NA)-Augmented Inverse Probability Weighting (AIPW)" strategy, which consists of a sampling rule that uses the Neyman allocation with estimated standard deviations and a recommendation rule that uses an AIPW estimator. The proposed strategy is asymptotically optimal in the sense that its upper bound on the probability of misidentification matches the lower bound as the budget goes to infinity and the gap goes to zero.
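
To make the two rules named in the abstract concrete, here is a minimal Python sketch of a simulation in that spirit. It is not the authors' implementation. The function names `neyman_probability` and `na_aipw`, the probability clipping to [0.05, 0.95], the default standard deviation of 1.0 before an arm has two observations, and the plug-in mean of 0.0 before an arm has been drawn are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random


def neyman_probability(sd0, sd1):
    """Neyman allocation for two arms: draw arm 0 with probability sd0 / (sd0 + sd1)."""
    return sd0 / (sd0 + sd1)


def na_aipw(means, sds, budget, seed=0, clip=0.05):
    """Simulate one two-armed Gaussian bandit run.

    Returns (recommended_arm, aipw_estimates). Illustrative sketch only:
    the clipping level and the initial plug-in values are assumptions.
    """
    rng = random.Random(seed)
    n = [0, 0]            # number of draws per arm
    s = [0.0, 0.0]        # sum of rewards per arm
    ss = [0.0, 0.0]       # sum of squared rewards per arm
    aipw_sum = [0.0, 0.0]

    for _ in range(budget):
        # Plug-in estimates built from past observations only.
        mu_hat = [s[a] / n[a] if n[a] > 0 else 0.0 for a in (0, 1)]
        sd_hat = []
        for a in (0, 1):
            if n[a] >= 2:
                var = max(ss[a] / n[a] - mu_hat[a] ** 2, 1e-12)
                sd_hat.append(math.sqrt(var))
            else:
                sd_hat.append(1.0)  # assumed default before enough data

        # Sampling rule: Neyman allocation with estimated standard deviations.
        w0 = neyman_probability(sd_hat[0], sd_hat[1])
        w0 = min(max(w0, clip), 1.0 - clip)  # assumed clipping for stability
        w = [w0, 1.0 - w0]

        arm = 0 if rng.random() < w[0] else 1
        y = rng.gauss(means[arm], sds[arm])

        # AIPW score: inverse-probability-weighted residual plus plug-in mean.
        for a in (0, 1):
            indicator = 1.0 if a == arm else 0.0
            aipw_sum[a] += indicator * (y - mu_hat[a]) / w[a] + mu_hat[a]

        n[arm] += 1
        s[arm] += y
        ss[arm] += y * y

    estimates = [aipw_sum[a] / budget for a in (0, 1)]
    # Recommendation rule: pick the arm with the larger AIPW estimate.
    recommended = 0 if estimates[0] >= estimates[1] else 1
    return recommended, estimates


if __name__ == "__main__":
    arm, est = na_aipw(means=(1.0, 0.0), sds=(1.0, 2.0), budget=2000, seed=0)
    print("recommended arm:", arm, "AIPW estimates:", est)
```

With true standard deviations 1 and 2, the sampling probability for arm 0 should drift toward 1/3 as the estimates stabilize, so the noisier arm receives more draws. The small-gap regime the abstract refers to corresponds to choosing `means` that are close together, where many more rounds are needed before the recommendation becomes reliable.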

Date: 2022-01, Revised 2022-12
New Economics Papers: this item is included in nep-ecm
Citations: View citations in EconPapers (4)

Downloads: (external link)
http://arxiv.org/pdf/2201.04469 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2201.04469

Bibliographic data for series maintained by arXiv administrators.

 
Page updated 2025-03-19
Handle: RePEc:arx:papers:2201.04469