Limiting Dynamics for Q-Learning with Memory One in Symmetric Two-Player, Two-Action Games
J. M. Meylahn and L. Janssen
Academic Editor: Hassan Zargarzadeh
Complexity, 2022, vol. 2022, 1-20
Abstract:
We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner’s dilemma, stag hunt, and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair that is possible in all three games is the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
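The abstract's central claim, that win-stay, lose-shift (WSLS) can form a mutual best-response pair among memory-one strategies, can be illustrated with a small enumeration. The sketch below is not the paper's symbolic computer-algebra method; it assumes specific prisoner's dilemma payoffs (T=5, R=3, P=1, S=0) and checks that, under long-run average payoffs, no deterministic memory-one deviation outperforms WSLS against WSLS.

```python
import itertools

# Illustrative prisoner's dilemma payoffs T=5, R=3, P=1, S=0 (assumed values;
# the paper instead derives existence conditions symbolically).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def long_run_payoff(s1, s2, start=('C', 'C')):
    """Exact long-run average payoff to player 1 when both players use
    deterministic memory-one strategies (maps: last joint action -> next move).
    Deterministic dynamics on 4 states must enter a cycle, so we detect the
    cycle and average player 1's payoffs along it."""
    seen, payoffs, state = {}, [], start
    while state not in seen:
        seen[state] = len(payoffs)
        payoffs.append(PAYOFF[state][0])
        a, b = state
        state = (s1[(a, b)], s2[(b, a)])  # each player sees (own, opponent)
    cycle = payoffs[seen[state]:]
    return sum(cycle) / len(cycle)

states = [('C', 'C'), ('C', 'D'), ('D', 'C'), ('D', 'D')]
# Win-stay, lose-shift: repeat your move after a high payoff (R or T), switch
# after a low one (S or P); i.e. cooperate next iff both moves matched.
wsls = {('C', 'C'): 'C', ('C', 'D'): 'D', ('D', 'C'): 'D', ('D', 'D'): 'C'}

# Mutual best response: no memory-one deviation earns more against WSLS than
# WSLS itself does. Enumerate all 16 deterministic memory-one strategies.
baseline = long_run_payoff(wsls, wsls)
for moves in itertools.product('CD', repeat=4):
    deviation = dict(zip(states, moves))
    assert long_run_payoff(deviation, wsls) <= baseline + 1e-9
print(f"WSLS vs WSLS long-run payoff: {baseline}")  # mutual cooperation: 3.0
```

With these particular payoffs the check passes (all-defect merely ties at 3.0); the paper's contribution is characterizing the payoff conditions under which such equilibria exist across all three games.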
Date: 2022
Citations: 3 (in EconPapers)
Downloads:
http://downloads.hindawi.com/journals/complexity/2022/4830491.pdf (application/pdf)
http://downloads.hindawi.com/journals/complexity/2022/4830491.xml (application/xml)
Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:4830491
DOI: 10.1155/2022/4830491