A Bayesian two‐armed bandit model
Xikui Wang, You Liang and Lysa Porth
Applied Stochastic Models in Business and Industry, 2019, vol. 35, issue 3, 624-636
Abstract:
A two‐armed bandit model using a Bayesian approach is formulated and investigated in this paper with the goal of maximizing the value of a certain criterion of optimality. The bandit model illustrates the trade‐off between exploration and exploitation, where exploration means acquiring scientific knowledge for better‐informed decisions at later stages (i.e., maximizing long‐term benefit), and exploitation means applying the current knowledge for the best possible outcome at the current stage (i.e., maximizing the immediate expected payoff). When one arm has known characteristics, stochastic dynamic programming is applied to characterize the optimal strategy and provide the foundation for its calculation. The results show that the celebrated Gittins index can be approximated by a monotonic sequence of break‐even values. When both arms are unknown, we derive a special case of optimality of the myopic strategy.
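As an illustrative sketch only (the paper's exact criterion and formulation may differ), the one-known-arm case can be posed as a finite-horizon dynamic program: the unknown arm carries a Beta posterior, the known arm pays 1 with probability lam, and bisection on lam recovers the break-even value at which the decision-maker is indifferent between the two arms. All names and the undiscounted finite-horizon setting below are assumptions for illustration:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def value(a, b, n, lam):
    """Optimal expected payoff with n pulls left: the unknown arm has a
    Beta(a, b) posterior on its success probability; the known arm pays 1
    with probability lam.  If the known arm is ever optimal it remains so,
    hence its continuation value is n * lam."""
    if n == 0:
        return 0.0
    p = a / (a + b)  # posterior mean of the unknown arm's success probability
    explore = (p * (1 + value(a + 1, b, n - 1, lam))
               + (1 - p) * value(a, b + 1, n - 1, lam))
    return max(n * lam, explore)

def break_even(a, b, horizon, tol=1e-8):
    """Bisect on lam to find the break-even value at which pulling the
    known arm and sampling the unknown arm are equally attractive at the
    first stage.  As the horizon grows, these break-even values form the
    monotonic sequence that approximates the Gittins index."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        lam = (lo + hi) / 2
        p = a / (a + b)
        explore = (p * (1 + value(a + 1, b, horizon - 1, lam))
                   + (1 - p) * value(a, b + 1, horizon - 1, lam))
        if horizon * lam >= explore:
            hi = lam  # known arm already (weakly) optimal: lower the bound
        else:
            lo = lam  # exploration still worth more: raise the bound
    return (lo + hi) / 2
```

With a uniform Beta(1, 1) prior, the one-step break-even value is the posterior mean 0.5, and longer horizons push it upward, reflecting the option value of exploration.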
Date: 2019
DOI: https://doi.org/10.1002/asmb.2355
Persistent link: https://EconPapers.repec.org/RePEc:wly:apsmbi:v:35:y:2019:i:3:p:624-636