Cycles and collusion in congestion games under Q-learning
Cesare Carissimo, Jan Nagler and Heinrich Nax
Papers from arXiv.org
Abstract:
We investigate the dynamics of Q-learning in a class of generalized Braess paradox games. These games represent an important class of network routing games where the associated stage-game Nash equilibria do not constitute social optima. We provide a full convergence analysis of Q-learning with varying parameters and learning rates. A wide range of phenomena emerges, broadly either settling into Nash or cycling continuously in ways reminiscent of "Edgeworth cycles" (i.e. jumping suddenly from Nash toward social optimum and then deteriorating gradually back to Nash). Our results reveal an important incentive incompatibility when thinking in terms of a meta-game being played by the designers of the individual Q-learners who set their agents' parameters. Indeed, Nash equilibria of the meta-game are characterized by heterogeneous parameters, and resulting outcomes achieve little to no cooperation beyond Nash. In conclusion, we suggest a novel perspective for thinking about regulation and collusion, and discuss the implications of our results for Bertrand oligopoly pricing games.
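To illustrate the kind of setup the abstract describes, the following is a minimal sketch (not the authors' code) of independent epsilon-greedy Q-learners repeatedly choosing routes in the textbook Braess network with a zero-cost shortcut. The cost functions, number of agents, learning rate, exploration rate, and the single-state (bandit-style) Q-update are all illustrative assumptions, not details taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 100                     # number of routing agents (assumed)
ACTIONS = ["up", "down", "cross"]  # "cross" uses the Braess shortcut
ALPHA, EPS, T = 0.1, 0.05, 5000    # learning rate, exploration rate, rounds (assumed)

def route_costs(counts):
    """Per-agent cost of each route given how many agents use each route.

    Textbook Braess setup: the two congestible links cost (load / N),
    the two fixed links cost 1, and the shortcut is free.
    """
    up, down, cross = counts
    x = (up + cross) / N_AGENTS    # load on first congestible link
    y = (down + cross) / N_AGENTS  # load on second congestible link
    return np.array([x + 1.0, 1.0 + y, x + y])

# One Q-value per action per agent (stateless repeated game).
Q = np.zeros((N_AGENTS, len(ACTIONS)))

for t in range(T):
    explore = rng.random(N_AGENTS) < EPS
    greedy = Q.argmax(axis=1)
    random_a = rng.integers(len(ACTIONS), size=N_AGENTS)
    actions = np.where(explore, random_a, greedy)

    counts = np.bincount(actions, minlength=len(ACTIONS))
    costs = route_costs(counts)
    rewards = -costs[actions]      # reward = negative travel cost

    # Bandit-style Q-learning update for the chosen action only.
    idx = np.arange(N_AGENTS)
    Q[idx, actions] += ALPHA * (rewards - Q[idx, actions])

print("final route counts (up, down, cross):", np.bincount(actions, minlength=3))
print("mean travel cost:", float(costs[actions].mean()))

Under these assumed costs, all agents taking "cross" corresponds to the stage-game Nash equilibrium (cost about 2), while an even split over "up" and "down" corresponds to the social optimum (cost about 1.5); varying ALPHA and EPS across agents is the kind of parameter heterogeneity the abstract's meta-game refers to.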
Date: 2025-02
New Economics Papers: this item is included in nep-inv and nep-mic
Downloads: http://arxiv.org/pdf/2502.18984 (application/pdf, latest version)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2502.18984