
Bounded Regret for Finitely Parameterized Multi-Armed Bandits

Kishan Panaganti, Dileep Kalathil and Pravin Varaiya
Additional contact information
Kishan Panaganti: Texas A&M University, Department of Electrical and Computer Engineering
Dileep Kalathil: Texas A&M University, Department of Electrical and Computer Engineering
Pravin Varaiya: University of California, Department of Electrical Engineering and Computer Sciences

A chapter in Stochastic Analysis, Filtering, and Stochastic Optimization, 2022, pp 411-429 from Springer

Abstract: We consider multi-armed bandits where the model of the underlying stochastic environment is characterized by a common unknown parameter. The true parameter is unknown to the learning agent; however, the set of possible parameters, which is finite, is known a priori. We propose a simple, easy-to-implement algorithm, which we call the Finitely Parameterized Upper Confidence Bound (FP-UCB) algorithm, that uses the information about the underlying parameter set for faster learning. In particular, we show that the FP-UCB algorithm achieves a bounded regret under a structural condition on the underlying parameter set. We also show that, if the underlying parameter set does not satisfy this structural condition, the FP-UCB algorithm achieves a logarithmic regret, but with a smaller preceding constant compared to the standard UCB algorithm. We also validate the superior performance of the FP-UCB algorithm through extensive numerical simulations.
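
For orientation, the standard UCB baseline that the abstract compares against can be sketched as follows. This is the textbook UCB1 rule, not the chapter's FP-UCB algorithm, whose details are not given in this record. The function name `ucb1`, the Bernoulli reward model, the exploration constant 2, and the example arm means are illustrative assumptions.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run textbook UCB1 on Bernoulli arms (an assumed reward model).

    Returns the per-arm pull counts and the cumulative pseudo-regret,
    i.e. the sum over rounds of (best mean - mean of the pulled arm).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total reward collected from each arm
    best = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - arm_means[arm]

    return counts, regret


if __name__ == "__main__":
    pulls, total_regret = ucb1([0.9, 0.1], horizon=2000)
    print("pulls per arm:", pulls)
    print("cumulative pseudo-regret:", total_regret)
```

In this baseline the pseudo-regret keeps growing, roughly logarithmically in the horizon, because the confidence bonus forces occasional pulls of every arm. The abstract's claim is that FP-UCB, by exploiting the known finite parameter set, avoids this growth under a structural condition on that set.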

Date: 2022
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-030-98519-6_17

Ordering information: This item can be ordered from
http://www.springer.com/9783030985196

DOI: 10.1007/978-3-030-98519-6_17

Page updated 2026-05-22
Handle: RePEc:spr:sprchp:978-3-030-98519-6_17