Multi-Gear Bandits, Partial Conservation Laws, and Indexability
José Niño-Mora
José Niño-Mora: Department of Statistics, Carlos III University of Madrid, 28903 Getafe, Spain
Mathematics, 2022, vol. 10, issue 14, 1-31
Abstract:
This paper considers what we propose to call multi-gear bandits, which are Markov decision processes modeling a generic dynamic and stochastic project fueled by a single resource and which admit multiple actions representing gears of operation naturally ordered by their increasing resource consumption. The optimal operation of a multi-gear bandit aims to strike a balance between project performance costs or rewards and resource usage costs, which depend on the resource price. A computationally convenient and intuitive optimal solution is available when such a model is indexable, meaning that its optimal policies are characterized by a dynamic allocation index (DAI), a function of state–action pairs representing critical resource prices. Motivated by the lack of general indexability conditions and efficient index-computing schemes, and focusing on the infinite-horizon finite-state and -action discounted case, we present a verification theorem ensuring that, if a model satisfies two proposed PCL-indexability conditions with respect to a postulated family of structured policies, then it is indexable and such policies are optimal, with its DAI being given by a marginal productivity index computed by a downshift adaptive-greedy algorithm in AN steps, with A + 1 actions and N states. The DAI is further used as the basis of a new index policy for the multi-armed multi-gear bandit problem.
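As a rough illustration of the trade-off the abstract describes, the sketch below solves a tiny discounted multi-gear bandit by plain value iteration for a given resource price. It is not the paper's downshift adaptive-greedy algorithm, and the instance (two states, three gears, all rewards, usages and transition probabilities) is hypothetical, chosen only so that the optimal gear visibly shifts down as the price rises.

```python
def solve_multigear(reward, usage, trans, price, beta=0.9, tol=1e-10):
    """Value iteration for max E[sum_t beta^t (reward - price * usage)].

    reward[a][s], usage[a][s]: one-period reward and resource use of gear a in state s.
    trans[a][s][t]: probability of moving from state s to state t under gear a.
    Returns (values, policy), where policy[s] is an optimal gear in state s
    (ties are broken in favor of the lowest gear).
    """
    n_actions, n_states = len(reward), len(reward[0])
    v = [0.0] * n_states
    while True:
        # q[s][a]: value of using gear a now in state s, then acting optimally.
        q = [[reward[a][s] - price * usage[a][s]
              + beta * sum(trans[a][s][t] * v[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        new_v = [max(row) for row in q]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v, [row.index(max(row)) for row in q]
        v = new_v


# Hypothetical toy instance (made-up numbers): state 1 is "good" and pays 1.
# Gear a consumes a units of resource and reaches the good state with
# probability P_GOOD[a], regardless of the current state.
P_GOOD = [0.2, 0.6, 0.8]
REWARD = [[0.0, 1.0] for _ in P_GOOD]
USAGE = [[float(a), float(a)] for a in range(len(P_GOOD))]
TRANS = [[[1.0 - p, p], [1.0 - p, p]] for p in P_GOOD]

if __name__ == "__main__":
    for price in (0.0, 0.25, 1.0):
        _, policy = solve_multigear(REWARD, USAGE, TRANS, price)
        print(price, policy)
```

In this toy instance the prices at which the optimal gear changes play the role that critical resource prices play in an indexable model. Re-solving the model for each price, as done here, is the brute-force alternative to computing an index once.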
Keywords: Markov decision process; multi-gear bandits; index policies; indexability; index algorithm
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/10/14/2497/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/14/2497/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:14:p:2497-:d:865645
Mathematics is currently edited by Ms. Emma He