The Knowledge Gradient Algorithm for a General Class of Online Learning Problems

Ryzhov, Ilya O.; Powell, Warren B.; Frazier, Peter I.

Ilya O. Ryzhov, Warren B. Powell and Peter I. Frazier
Additional contact information
Ilya O. Ryzhov: Robert H. Smith School of Business, University of Maryland, College Park, Maryland 20742
Warren B. Powell: Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544
Peter I. Frazier: Department of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853

Operations Research, 2012, vol. 60, issue 1, 180-195

Abstract: We derive a one-period look-ahead policy for finite- and infinite-horizon online optimal learning problems with Gaussian rewards. Our approach is able to handle the case where our prior beliefs about the rewards are correlated, which is not handled by traditional multiarmed bandit methods. Experiments show that our KG policy performs competitively against the best-known approximation to the optimal policy in the classic bandit problem, and it outperforms many learning policies in the correlated case.
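To make the idea of a one-period look-ahead policy concrete, the following is a minimal sketch of one common form of an online knowledge gradient rule. It is not taken from this record. It assumes independent normal beliefs (mean `mu[x]`, variance `sigma2[x]`) and a known observation noise variance `noise_var`. The function names, and the choice to weight the look-ahead value by the number of `remaining` decisions, are illustrative assumptions. The correlated-beliefs case mentioned in the abstract is not covered here.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def kg_factors(mu, sigma2, noise_var):
    """Expected one-step improvement in the best posterior mean, per alternative.

    Assumes independent normal beliefs and at least two alternatives.
    """
    factors = []
    for x in range(len(mu)):
        if sigma2[x] <= 0.0:
            # Nothing left to learn about an alternative known with certainty.
            factors.append(0.0)
            continue
        # Standard deviation of the change in the posterior mean of x
        # caused by a single noisy observation of x.
        sigma_tilde = sigma2[x] / math.sqrt(sigma2[x] + noise_var)
        best_other = max(m for i, m in enumerate(mu) if i != x)
        zeta = -abs(mu[x] - best_other) / sigma_tilde
        value = sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta))
        factors.append(max(0.0, value))  # guard against rounding below zero
    return factors


def online_kg_choice(mu, sigma2, noise_var, remaining):
    """Pick the alternative maximizing current mean plus weighted learning value.

    `remaining` (decisions left after this one) is an assumed horizon weight.
    """
    nu = kg_factors(mu, sigma2, noise_var)
    scores = [m + remaining * v for m, v in zip(mu, nu)]
    return max(range(len(mu)), key=scores.__getitem__)
```

With `remaining` set to 0 the rule reduces to pure exploitation (pick the largest current mean). A larger `remaining` shifts weight toward alternatives whose measurement is expected to change the decision.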

Keywords: multiarmed bandit; optimal learning; online learning; knowledge gradient; Gittins index; index policy
Date: 2012
Citations: 23 (in EconPapers)

Downloads: (external link)
http://dx.doi.org/10.1287/opre.1110.0999 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:60:y:2012:i:1:p:180-195

Operations Research is published by INFORMS.
Bibliographic data for series maintained by Chris Asher.