Technical Note—A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Daniel Russo ()
Additional contact information
Daniel Russo: Graduate School of Business, Columbia University, New York, New York 10027
Operations Research, 2021, vol. 69, issue 1, 273-278
Abstract:
This note gives a short, self-contained proof of a sharp connection between Gittins indices and Bayesian upper confidence bound algorithms. I consider a Gaussian multiarmed bandit problem with discount factor γ. The Gittins index of an arm is shown to equal the γ-quantile of the posterior distribution of the arm’s mean plus an error term that vanishes as γ → 1. In this sense, for sufficiently patient agents, a Gittins index measures the highest plausible mean-reward of an arm in a manner equivalent to an upper confidence bound.
Keywords: Gittins index; upper confidence bound; multiarmed bandits (search for similar items in EconPapers)
Date: 2021
References: Add references at CitEc
Citations:
Downloads: (external link)
https://doi.org/10.1287/opre.2020.1987 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:69:y:2021:i:1:p:273-278
Access Statistics for this article
More articles in Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().