Minimax lower bounds for the two-armed bandit problem

Sanjeev R. Kulkarni and Gabor Lugosi

Economics Working Papers from Department of Economics and Business, Universitat Pompeu Fabra

Abstract: We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. Also, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every n, there is a configuration such that the regret at time n is at least (1 − ε) times the regret of random guessing, where ε is any small positive constant.

Keywords: Bandit problem; minimax lower bounds (search for similar items in EconPapers)
JEL-codes: C12 C73 (search for similar items in EconPapers)
Date: 1997-02
New Economics Papers: this item is included in nep-gth
References: View complete reference list from CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://econ-papers.upf.edu/papers/206.pdf Whole Paper (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:upf:upfgen:206

Access Statistics for this paper


Page updated 2025-04-01
Handle: RePEc:upf:upfgen:206