
Optimism and pessimism in optimised replay

Georgy Antonov, Christopher Gagne, Eran Eldar and Peter Dayan

PLOS Computational Biology, 2022, vol. 18, issue 1, 1-32

Abstract: The replay of task-relevant trajectories is known to contribute to memory consolidation and improved task performance. A wide variety of experimental data show that the content of replayed sequences is highly specific and can be modulated by reward as well as other prominent task variables. However, the rules governing the choice of sequences to be replayed still remain poorly understood. One recent theoretical suggestion is that the prioritization of replay experiences in decision-making problems is based on their effect on the choice of action. We show that this implies that subjects should replay sub-optimal actions that they dysfunctionally choose rather than optimal ones, when, by being forgetful, they experience large amounts of uncertainty in their internal models of the world. We use this to account for recent experimental data demonstrating exactly pessimal replay, fitting model parameters to the individual subjects’ choices.

Author summary: When animals are asleep or restfully awake, populations of neurons in their brains recapitulate activity associated with extended behaviourally-relevant experiences. This process is called replay, and it has been established for a long time in rodents, and very recently in humans, to be important for good performance in decision-making tasks. The specific experiences which are replayed during those epochs follow highly ordered patterns, but the mechanisms which establish their priority are still not fully understood. One promising theoretical suggestion is that each replay experience is chosen in such a way that the learning that ensues is most helpful for the subsequent performance of the animal. A very recent study reported a surprising result that humans who achieved high performance in a planning task tended to replay actions they found to be sub-optimal, and that this was associated with a useful deprecation of those actions in subsequent performance. In this study, we examine the nature of this pessimized form of replay and show that it is exactly appropriate for forgetful agents. We analyse the role of forgetting for replay choices of our model, and verify our predictions using human subject data.

Date: 2022

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009634 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 09634&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1009634

DOI: 10.1371/journal.pcbi.1009634

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol.

 
Page updated 2025-03-22
Handle: RePEc:plo:pcbi00:1009634