Bandit bounds from stochastic variability extrema

Herschkorn, Stephen J.

Statistics & Probability Letters, 1997, vol. 35, issue 3, 283-288

Abstract: In the consideration of bandit problems with general rewards and discount sequences, we compare an arm to one whose reward distribution may be one of two degenerate distributions. For the general multi-armed case, the latter problem provides an upper bound on the optimal return. In the case of two arms with the second known and regular discounting, consideration of the two-point distribution provides a sufficient condition for stopping. We interpret these results in the context of the value of information. The results, and others in the literature, suggest that bandit thresholds (or indices) may be monotonic with respect to ordering of distributions in the convex sense.

Keywords: Bandit problems; stochastic variability ordering; convex ordering; value of information
Date: 1997

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167-7152(97)00024-2
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:stapro:v:35:y:1997:i:3:p:283-288

Statistics & Probability Letters is currently edited by Somnath Datta and Hira L. Koul