The Bounds of Algorithmic Collusion: Q-learning, Gradient Learning, and the Folk Theorem
Galit Askenazi-Golan, Domenico Mergoni Cecchelli, Edward Plumb and Clemens Possnig
Additional contact information
Galit Askenazi-Golan: Department of Mathematics, The London School of Economics and Political Science
Domenico Mergoni Cecchelli: Department of Mathematics, The London School of Economics and Political Science
Edward Plumb: Department of Mathematics, The London School of Economics and Political Science
Clemens Possnig: School of Economics, University of Waterloo
No 26002, Working Papers from University of Waterloo, Department of Economics
Abstract:
We explore the behaviour emerging from learning agents repeatedly interacting strategically for a wide range of learning dynamics, including Q-learning, projected gradient, replicator and log-barrier dynamics. Going beyond the better understood classes of potential games and zero-sum games, we consider the setting of a general repeated game with finite recall under different forms of monitoring. We obtain a Folk Theorem-style result and characterise the set of payoff vectors that can be obtained by these dynamics, discovering a wide range of possibilities for the emergence of algorithmic collusion. Achieving this requires a novel technical approach, which, to the best of our knowledge, yields the first convergence result for multi-agent Q-learning algorithms in repeated games.
Keywords: Q-learning; projected gradient; replicator; log-barrier dynamics
Pages: 45 pages
Date: 2026-03-03
New Economics Papers: this item is included in nep-mic
Forthcoming
Downloads:
https://hdl.handle.net/10012/23583 First version, 2026 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:wat:wpaper:26002