Intrinsic fluctuations of reinforcement learning promote cooperation
Wolfram Barfuss and
Janusz Meylahn
Papers from arXiv.org
Abstract:
In this work, we ask for and answer what makes classical temporal-difference reinforcement learning with epsilon-greedy strategies cooperative. Cooperating in social dilemma situations is vital for animals, humans, and machines. While evolutionary theory revealed a range of mechanisms promoting cooperation, the conditions under which agents learn to cooperate are contested. Here, we demonstrate which and how individual elements of the multi-agent learning setting lead to cooperation. We use the iterated Prisoner's dilemma with one-period memory as a testbed. Each of the two learning agents learns a strategy that conditions the following action choices on both agents' action choices of the last round. We find that next to a high caring for future rewards, a low exploration rate, and a small learning rate, it is primarily intrinsic stochastic fluctuations of the reinforcement learning process which double the final rate of cooperation to up to 80%. Thus, inherent noise is not a necessary evil of the iterative learning process. It is a critical asset for the learning of cooperation. However, we also point out the trade-off between a high likelihood of cooperative behavior and achieving this in a reasonable amount of time. Our findings are relevant for purposefully designing cooperative algorithms and regulating undesired collusive effects.
Date: 2022-09, Revised 2023-02
New Economics Papers: this item is included in nep-cbe, nep-evo and nep-gth
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (2)
Published in Scientific Reports 13, 1309 (2023)
Downloads: (external link)
http://arxiv.org/pdf/2209.01013 Latest version (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2209.01013
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().