Learning to Charge More: A Theoretical Study of Collusion by Q-Learning Agents

Cristian Chica, Yinglong Guo and Gilad Lerman

Papers from arXiv.org

Abstract: There is growing experimental evidence that $Q$-learning agents may learn to charge supracompetitive prices. We provide the first theoretical explanation for this behavior in infinitely repeated games. Firms update their pricing policies based solely on observed profits, without computing equilibrium strategies. We show that when the game admits both a one-stage Nash equilibrium price and a collusive-enabling price, and when the $Q$-function satisfies certain inequalities at the end of experimentation, firms learn to consistently charge supracompetitive prices. We introduce a new class of one-memory subgame perfect equilibria (SPEs) and provide conditions under which learned behavior is supported by naive collusion, grim trigger policies, or increasing strategies. Naive collusion does not constitute an SPE unless the collusive-enabling price is a one-stage Nash equilibrium, whereas grim trigger policies can.

Date: 2025-05
New Economics Papers: this item is included in nep-com and nep-mic

Downloads: http://arxiv.org/pdf/2505.22909 (latest version, application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2505.22909

Handle: RePEc:arx:papers:2505.22909