Multiagent Online Learning in Time-Varying Games
Benoit Duvocelle,
Panayotis Mertikopoulos,
Mathias Staudigl and
Dries Vermeulen
Additional contact information
Benoit Duvocelle: Department of Quantitative Economics, Maastricht University, NL–6200 MD Maastricht, Netherlands
Panayotis Mertikopoulos: Université Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France; Criteo AI Lab, 38130 Echirolles, France
Mathias Staudigl: Department of Advanced Computing Sciences, Maastricht University, NL–6200 MD Maastricht, Netherlands
Mathematics of Operations Research, 2023, vol. 48, issue 2, 914-941
Abstract:
We examine the long-run behavior of multiagent online learning in games that evolve over time. Specifically, we focus on a wide class of policies based on mirror descent, and we show that the induced sequence of play (a) converges to a Nash equilibrium in time-varying games that stabilize in the long run to a strictly monotone limit, and (b) stays asymptotically close to the evolving equilibrium of the sequence of stage games (assuming they are strongly monotone). Our results apply to both gradient-based and payoff-based feedback, the latter being the case where players only get to observe the payoffs of their chosen actions.
Keywords: Primary: 91A10; 91A26; secondary: 68Q32; 90C25; dynamic regret; Nash equilibrium; mirror descent; time-varying games
Date: 2023
Downloads: http://dx.doi.org/10.1287/moor.2022.1283 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:48:y:2023:i:2:p:914-941
Publisher: INFORMS