Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards

Arya, Sakshi; Yang, Yuhong

Statistics & Probability Letters, 2020, vol. 164, issue C

Abstract: We study a multi-armed bandit problem with covariates in a setting where there is a possible delay in observing the rewards. Under some reasonable assumptions on the probability distributions for the delays and using an appropriate randomization to select the arms, the proposed strategy is shown to be strongly consistent.
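As a rough illustration of the setting the abstract describes, the sketch below simulates a histogram-based randomized allocation rule with delayed rewards. It is not the authors' algorithm: the mean reward functions, the uniform delay distribution, the exploration rate `1/sqrt(t)`, and every name in the code (`histogram_bandit`, `n_bins`, `max_delay`) are assumptions chosen only to make the idea concrete.

```python
import random


def histogram_bandit(horizon=2000, n_arms=2, n_bins=5, max_delay=10, seed=0):
    """Simulate a histogram-based randomized allocation with delayed rewards.

    Hypothetical sketch: the reward model, delay law, and exploration
    schedule below are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)

    def mean_reward(arm, x):
        # Assumed mean reward functions on a covariate x in [0, 1].
        return x if arm % 2 == 0 else 1.0 - x

    # Per-arm, per-bin running sums and counts of *observed* rewards.
    sums = [[0.0] * n_bins for _ in range(n_arms)]
    counts = [[0] * n_bins for _ in range(n_arms)]
    pulls = [0] * n_arms
    pending = []  # entries: (arrival_time, arm, bin_index, reward)

    for t in range(1, horizon + 1):
        # Fold in rewards whose delay has elapsed by time t.
        waiting = []
        for arrival, arm, b, reward in pending:
            if arrival <= t:
                sums[arm][b] += reward
                counts[arm][b] += 1
            else:
                waiting.append((arrival, arm, b, reward))
        pending = waiting

        # Observe a covariate and locate its histogram bin.
        x = rng.random()
        b = min(int(x * n_bins), n_bins - 1)

        # Histogram estimate of each arm's mean reward in this bin
        # (0.0 when no reward has been observed there yet).
        estimates = [
            sums[a][b] / counts[a][b] if counts[a][b] else 0.0
            for a in range(n_arms)
        ]
        greedy = max(range(n_arms), key=lambda a: estimates[a])

        # Randomized allocation: explore uniformly with a decaying probability.
        explore_prob = min(1.0, 1.0 / t ** 0.5)
        arm = rng.randrange(n_arms) if rng.random() < explore_prob else greedy

        # The reward is generated now but only observed after a random delay.
        reward = mean_reward(arm, x) + rng.gauss(0.0, 0.1)
        delay = rng.randint(0, max_delay)
        pending.append((t + delay, arm, b, reward))
        pulls[arm] += 1

    return {"sums": sums, "counts": counts, "pulls": pulls}


if __name__ == "__main__":
    result = histogram_bandit()
    print("pulls per arm:", result["pulls"])
```

Bounding the delay by `max_delay` is a simplification made for the simulation; the abstract only says that the delay distributions satisfy some reasonable assumptions.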

Keywords: Multi-armed bandit with covariates; Delayed rewards; Histogram method; Strong consistency
Date: 2020

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167715220301218
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:stapro:v:164:y:2020:i:c:s0167715220301218

DOI: 10.1016/j.spl.2020.108818

Statistics & Probability Letters is currently edited by Somnath Datta and Hira L. Koul