Bandit and covariate processes, with finite or non-denumerable set of arms
Tze Leung Lai, Michael Benjamin Sklar and Huanzhong Xu
Stochastic Processes and their Applications, 2022, vol. 150, issue C, 1222-1237
Abstract:
We introduce herein a new approach to nonparametric multi-armed bandit theory involving both the bandit and the covariate processes. Following Berry et al. (1997), we assume a non-denumerable set of arms for the bandit process. The approach we develop herein can be readily extended to continuous-time processes by using ε-greedy randomization and arm elimination instead of dynamic allocation indices. It also carries out a stochastic search with O(1) expected time for a nearly optimal arm at covariate values in a given set B before applying ε-greedy randomization and arm elimination. The procedure is shown to attain the asymptotically minimal rates for the regret over B.
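The abstract names ε-greedy randomization and arm elimination as building blocks. The following is a minimal, generic sketch of how those two ideas can be combined on a finite list of Bernoulli arms. It is not the authors' procedure: it ignores covariates and the stochastic search over a non-denumerable arm set, and the function name, the Hoeffding-style confidence radius, and all parameters (`arm_means`, `horizon`, `epsilon`, `seed`) are assumptions made for illustration.

```python
import math
import random


def epsilon_greedy_with_elimination(arm_means, horizon, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy play with confidence-bound arm elimination.

    arm_means: hypothetical true success probabilities of Bernoulli arms.
    Returns the surviving arm indices, the pull counts, and the total reward.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    active = set(range(k))      # arms not yet eliminated
    counts = [0] * k            # number of pulls per arm
    sums = [0.0] * k            # cumulative reward per arm
    total_reward = 0.0

    def mean(a):
        return sums[a] / counts[a]

    def radius(a, t):
        # Hoeffding-style radius; an assumed choice, not taken from the paper.
        return math.sqrt(2.0 * math.log(max(t, 2)) / counts[a])

    for t in range(1, horizon + 1):
        untried = [a for a in active if counts[a] == 0]
        if untried:
            arm = min(untried)                  # pull every arm once first
        elif rng.random() < epsilon:
            arm = rng.choice(sorted(active))    # explore uniformly among survivors
        else:
            arm = max(active, key=mean)         # exploit the empirical leader

        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

        # Eliminate arms whose upper bound falls below the best lower bound.
        if all(counts[a] > 0 for a in active):
            best_lower = max(mean(a) - radius(a, t) for a in active)
            active = {a for a in active if mean(a) + radius(a, t) >= best_lower}

    return active, counts, total_reward


if __name__ == "__main__":
    survivors, pulls, reward = epsilon_greedy_with_elimination(
        [0.1, 0.5, 0.9], horizon=5000, epsilon=0.1, seed=1
    )
    print("surviving arms:", sorted(survivors))
    print("pulls per arm:", pulls)
```

With an infinite set of arms, one simple way to use such a routine would be to pass it a finite random sample of candidate arms; that step is likewise an assumption of this sketch rather than a description of the paper's search.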
Keywords: Multi-armed bandits; Infinite set of arms; Covariates for personalization; Adaptive allocation and its regret
Date: 2022
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0304414922000722
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:spapps:v:150:y:2022:i:c:p:1222-1237
Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/supportfaq.cws_home/regional
https://shop.elsevie ... _01_ooc_1&version=01
DOI: 10.1016/j.spa.2022.03.010
Stochastic Processes and their Applications is currently edited by T. Mikosch
Bibliographic data for series maintained by Catherine Liu.