Multiplayer Bandits Without Observing Collision Information
Gábor Lugosi and Abbas Mehrabian
Additional contact information
Gábor Lugosi: Department of Economics and Business, Pompeu Fabra University, Barcelona 08010, Spain; Barcelona Graduate School of Economics, Barcelona 08005, Spain; ICREA, Pg. Lluís Companys 23, Barcelona 08010, Spain
Abbas Mehrabian: School of Computer Science, McGill University, Montréal, Quebec H3A 0E9, Canada
Mathematics of Operations Research, 2022, vol. 47, issue 2, 1247-1265
Abstract:
We study multiplayer stochastic multiarmed bandit problems in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred and a more difficult setup in which no collision information is available. We give the first theoretical guarantees for the second model: an algorithm with a logarithmic regret and an algorithm with a square-root regret that does not depend on the gaps between the means. For the first model, we give the first square-root regret bounds that do not depend on the gaps. Building on these ideas, we also give an algorithm for reaching approximate Nash equilibria quickly in stochastic anticoordination games.
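The collision rule and the two feedback models described in the abstract can be illustrated with a minimal simulation sketch. This is not the authors' algorithm, only a toy model of the environment; the Bernoulli reward assumption and the names `play_round` and `feedback` are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def play_round(means, choices, rng):
    """Simulate one round of the multiplayer bandit.

    means[k]   -- mean reward of arm k (assumed Bernoulli here, for illustration)
    choices[i] -- index of the arm pulled by player i
    Returns (rewards, collided), one entry per player.
    """
    counts = Counter(choices)
    rewards, collided = [], []
    for arm in choices:
        hit = counts[arm] > 1  # two or more players on the same arm
        collided.append(hit)
        if hit:
            rewards.append(0.0)  # colliding players receive zero reward
        else:
            rewards.append(1.0 if rng.random() < means[arm] else 0.0)
    return rewards, collided


def feedback(rewards, collided, observe_collisions):
    """Return what each player observes under the chosen feedback model.

    With observe_collisions=True, a player sees (reward, collision flag).
    With observe_collisions=False, a player sees only the reward, so a zero
    may come from a collision or from an unlucky draw.
    """
    if observe_collisions:
        return list(zip(rewards, collided))
    return [(r, None) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    means = [1.0, 0.0, 1.0]   # hypothetical arm means
    choices = [0, 0, 2, 1]    # players 0 and 1 collide on arm 0
    rewards, collided = play_round(means, choices, rng)
    print(feedback(rewards, collided, observe_collisions=True))
    print(feedback(rewards, collided, observe_collisions=False))
```

In the second feedback model, player 3 (alone on an arm with mean 0) and players 0 and 1 (colliding) all observe the same zero reward, which is what makes that setting harder.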
Keywords: Primary: 68Q32; secondary: 62L12; 68W15; 91A15; multiplayer bandits; distributed learning; sequential decision making; decentralized algorithms; anticoordination games; opportunistic spectrum access
Date: 2022
Downloads: http://dx.doi.org/10.1287/moor.2021.1168
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:47:y:2022:i:2:p:1247-1265
Mathematics of Operations Research is published by INFORMS.