Selective Reviews of Bandit Problems in AI via a Statistical View
Pengjie Zhou,
Haoyu Wei and
Huiming Zhang
Additional contact information
Pengjie Zhou: Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Haoyu Wei: Department of Economics, University of California San Diego, La Jolla, CA 92093, USA
Huiming Zhang: Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Mathematics, 2025, vol. 13, issue 4, 1-54
Abstract:
Reinforcement learning (RL) is a widely researched area of artificial intelligence that focuses on teaching agents to make decisions through interactions with their environment. A key subset includes multi-armed bandit (MAB) and stochastic continuum-armed bandit (SCAB) problems, which model sequential decision-making under uncertainty. This review outlines the foundational models and assumptions of bandit problems, explores non-asymptotic theoretical tools such as concentration inequalities and minimax regret bounds, and compares frequentist and Bayesian algorithms for managing exploration–exploitation trade-offs. Additionally, we explore K-armed contextual bandits and SCAB, focusing on their methodologies and regret analyses. We also examine the connections between SCAB problems and functional data analysis. Finally, we highlight recent advances and ongoing challenges in the field.
Keywords: bandit problems; exploration–exploitation; concentration inequalities; sub-Gaussian parameter estimation; minimax rate; functional data analysis
JEL-codes: C
Date: 2025
Downloads: (external link)
https://www.mdpi.com/2227-7390/13/4/665/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/4/665/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:4:p:665-:d:1593909
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.