Bounded Regret for Finitely Parameterized Multi-Armed Bandits
Kishan Panaganti,
Dileep Kalathil and
Pravin Varaiya
Additional contact information
Kishan Panaganti: Texas A&M University, Department of Electrical and Computer Engineering
Dileep Kalathil: Texas A&M University, Department of Electrical and Computer Engineering
Pravin Varaiya: University of California, Department of Electrical Engineering and Computer Sciences
A chapter in Stochastic Analysis, Filtering, and Stochastic Optimization, 2022, pp 411-429 from Springer
Abstract:
We consider multi-armed bandits where the model of the underlying stochastic environment is characterized by a common unknown parameter. The true parameter is unknown to the learning agent. However, the set of possible parameters, which is finite, is known a priori. We propose a simple, easy-to-implement algorithm, which we call the Finitely Parameterized Upper Confidence Bound (FP-UCB) algorithm, that uses the information about the underlying parameter set for faster learning. In particular, we show that the FP-UCB algorithm achieves a bounded regret under a structural condition on the underlying parameter set. We also show that, if the underlying parameter set does not satisfy this structural condition, the FP-UCB algorithm achieves a logarithmic regret, but with a smaller preceding constant compared to the standard UCB algorithm. We also validate the superior performance of the FP-UCB algorithm through extensive numerical simulations.
Date: 2022
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-030-98519-6_17
Ordering information: This item can be ordered from
http://www.springer.com/9783030985196
DOI: 10.1007/978-3-030-98519-6_17
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.