EconPapers    
Economics at your fingertips  
 

Robust control of the multi-armed bandit problem

Felipe Caro () and Aparupa Das Gupta ()
Additional contact information
Felipe Caro: UCLA Anderson School of Management
Aparupa Das Gupta: UCLA Anderson School of Management

Annals of Operations Research, 2022, vol. 317, issue 2, No 7, 480 pages

Abstract: Abstract We study a robust model of the multi-armed bandit (MAB) problem in which the transition probabilities are ambiguous and belong to subsets of the probability simplex. We first show that for each arm there exists a robust counterpart of the Gittins index that is the solution to a robust optimal stopping-time problem and can be computed effectively with an equivalent restart problem. We then characterize the optimal policy of the robust MAB as a project-by-project retirement policy but we show that arms become dependent so the policy based on the robust Gittins index is not optimal. For a project selection problem, we show that the robust Gittins index policy is near optimal but its implementation requires more computational effort than solving a non-robust MAB problem. Hence, we propose a Lagrangian index policy that requires the same computational effort as evaluating the indices of a non-robust MAB and is within 1 % of the optimum in the robust project selection problem.

Keywords: Multiarmed bandit; Index policies; Bellman equation; Robust Markov decision processes; Uncertain transition matrix; Project selection (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s10479-015-1965-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:317:y:2022:i:2:d:10.1007_s10479-015-1965-7

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479

DOI: 10.1007/s10479-015-1965-7

Access Statistics for this article

Annals of Operations Research is currently edited by Endre Boros

More articles in Annals of Operations Research from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-20
Handle: RePEc:spr:annopr:v:317:y:2022:i:2:d:10.1007_s10479-015-1965-7