Limiting Dynamics for Q-Learning with Memory One in Symmetric Two-Player, Two-Action Games

J. M. Meylahn, L. Janssen and Hassan Zargarzadeh

Complexity, 2022, vol. 2022, 1-20

Abstract: We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner’s dilemma, stag hunt, and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair that is possible in all three games is the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
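To make the setting concrete, the sketch below simulates two memory-one Q-learners in the iterated prisoner's dilemma, one of the games studied in the abstract. This is an illustrative online analogue, not the paper's method: the payoff values (T=5, R=3, P=1, S=0), the hyperparameters, and the epsilon-greedy exploration scheme are all assumptions for demonstration; the paper itself analyses sample *batch* Q-learning in the infinite batch size limit.

```python
import random

# Hypothetical illustration: memory-one Q-learning in the iterated
# prisoner's dilemma. A "state" is the previous joint action, so each
# player's greedy policy is a memory-one strategy (e.g. win-stay,
# lose-shift cooperates after (C,C) and (D,D)).
ACTIONS = [0, 1]  # 0 = cooperate, 1 = defect
PAYOFF = {(0, 0): (3, 3), (0, 1): (0, 5),
          (1, 0): (5, 0), (1, 1): (1, 1)}  # standard PD values (assumed)

def choose(Q, state, eps):
    """Epsilon-greedy action for the given memory-one state."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[state][a])

def train(steps=20000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    random.seed(seed)
    states = list(PAYOFF)  # four states: previous joint actions
    Q1 = {s: [0.0, 0.0] for s in states}
    Q2 = {s: [0.0, 0.0] for s in states}
    state = (0, 0)  # arbitrary initial state
    for _ in range(steps):
        a1 = choose(Q1, state, eps)
        a2 = choose(Q2, state, eps)
        r1, r2 = PAYOFF[(a1, a2)]
        nxt = (a1, a2)
        # One-step Q-update; the paper's batch variant averages many
        # sampled transitions before updating (infinite-batch limit).
        Q1[state][a1] += alpha * (r1 + gamma * max(Q1[nxt]) - Q1[state][a1])
        Q2[state][a2] += alpha * (r2 + gamma * max(Q2[nxt]) - Q2[state][a2])
        state = nxt
    return Q1, Q2
```

Reading off `max(ACTIONS, key=...)` per state after training gives each player's learned memory-one strategy, which can then be compared against the equilibrium strategy pairs (such as win-stay, lose-shift) characterised in the paper.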

Date: 2022
Citations: 3 (in EconPapers)

Downloads: (external link)
http://downloads.hindawi.com/journals/complexity/2022/4830491.pdf (application/pdf)
http://downloads.hindawi.com/journals/complexity/2022/4830491.xml (application/xml)



Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:4830491

DOI: 10.1155/2022/4830491


More articles in Complexity from Hindawi
Bibliographic data for series maintained by Mohamed Abdelhakeem.

 
Page updated 2025-03-19
Handle: RePEc:hin:complx:4830491