Multi-armed bandits based on a variant of Simulated Annealing

Mohammed Shahid Abdulla and Shalabh Bhatnagar
Additional contact information
Mohammed Shahid Abdulla: Indian Institute of Management
Shalabh Bhatnagar: Indian Institute of Science

Indian Journal of Pure and Applied Mathematics, 2016, vol. 47, issue 2, 195-212

Abstract: A variant of Simulated Annealing termed Simulated Annealing with Multiplicative Weights (SAMW) has been proposed in the literature. However, its convergence depended on a parameter β(T), which was calculated a priori from the total number of iterations T the algorithm would run for. We first show the convergence of SAMW even when a diminishing stepsize β_k → 1 is used, where k is the iteration index. Using this SAMW as a kernel, a stochastic multi-armed bandit (SMAB) algorithm called SOFTMIX can be improved to obtain the minimum-possible log regret, as compared to the log² regret of the original. Another modification of SOFTMIX is proposed which avoids the need for a parameter that depends on the reward distribution of the arms. Further, a variant of SOFTMIX that uses a comparison term drawn from another popular SMAB algorithm called UCB1 is described. It is also shown why the proposed scheme is computationally more efficient than UCB1, and an alternative to this algorithm with simpler stepsizes is also proposed. Numerical simulations for all the proposed algorithms are then presented.

Keywords: Stochastic processes; applied probability; statistics; discrete optimization
Date: 2016

Downloads: (external link)
http://link.springer.com/10.1007/s13226-016-0184-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:indpam:v:47:y:2016:i:2:d:10.1007_s13226-016-0184-5

Ordering information: This journal article can be ordered from
https://www.springer.com/journal/13226

DOI: 10.1007/s13226-016-0184-5

Indian Journal of Pure and Applied Mathematics is currently edited by Nidhi Chandhoke

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:indpam:v:47:y:2016:i:2:d:10.1007_s13226-016-0184-5