Performance-gated deliberation: A context-adapted strategy in which urgency is opportunity cost
Maximilian Puelma Touzel, Paul Cisek and Guillaume Lajoie
PLOS Computational Biology, 2022, vol. 18, issue 5, 1-33
Abstract:
Finding the right amount of deliberation, between insufficient and excessive, is a hard decision making problem that depends on the value we place on our time. Average-reward, putatively encoded by tonic dopamine, serves in existing reinforcement learning theory as the opportunity cost of time, including deliberation time. Importantly, this cost can itself vary with the environmental context and is not trivial to estimate. Here, we propose how the opportunity cost of deliberation can be estimated adaptively on multiple timescales to account for non-stationary contextual factors. We use it in a simple decision-making heuristic based on average-reward reinforcement learning (AR-RL) that we call Performance-Gated Deliberation (PGD). We propose PGD as a strategy used by animals wherein deliberation cost is implemented directly as urgency, a previously characterized neural signal effectively controlling the speed of the decision-making process. We show PGD outperforms AR-RL solutions in explaining behaviour and urgency of non-human primates in a context-varying random walk prediction task and is consistent with relative performance and urgency in a context-varying random dot motion task. We make readily testable predictions for both neural activity and behaviour.
Author summary:
The value we place on our time impacts what we choose to do with it. Value our time too little, and we obsess over all details. Value it too much, and we rush carelessly to move on. How we value our time and how this value affects how much of it we allocate to tasks is not well-understood. The related cognitive processes are nevertheless thought to play a role in a wide range of diseases from Parkinson’s to addiction. We propose a general strategy that balances the expected value of deliberation with the time spent, where time is valued according to recent performance. We found that recorded behaviour and brain activity from a previous experiment using non-human primates could be explained by this simple decision-making strategy. We show that this strategy explains how a brain signal called ‘urgency’, which limits how long subjects deliberate, varies with context. Our work helps to integrate the neuroscience of reward representations and the brain dynamics associated with deliberation.
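Illustrative sketch (not taken from the article): the idea of valuing time according to recent performance, tracked on more than one timescale, can be pictured with two exponential moving averages of reward per unit time and a commitment threshold that falls as deliberation time accumulates. Everything here is an assumption made for illustration: the names `RewardRateEstimator` and `should_commit`, the learning rates, the equal-weight combination of the two timescales, and the linear threshold. The article's own estimator and decision rule may differ.

```python
from dataclasses import dataclass


@dataclass
class RewardRateEstimator:
    """Tracks reward per unit time on a fast and a slow timescale (hypothetical)."""

    fast_rate: float = 0.2   # assumed learning rate, follows recent context
    slow_rate: float = 0.01  # assumed learning rate, follows the long-run average
    fast: float = 0.0
    slow: float = 0.0

    def update(self, reward: float, duration: float) -> None:
        """Fold one finished trial into both running averages."""
        sample = reward / duration  # reward rate observed on this trial
        self.fast += self.fast_rate * (sample - self.fast)
        self.slow += self.slow_rate * (sample - self.slow)

    def opportunity_cost(self) -> float:
        """Cost per unit time; equal weighting is an assumption, not the paper's rule."""
        return 0.5 * (self.fast + self.slow)


def should_commit(expected_reward: float, elapsed: float,
                  cost_rate: float, max_reward: float = 1.0) -> bool:
    """Commit once the expected reward reaches a threshold that drops with time.

    The threshold starts at max_reward and decreases by cost_rate per unit of
    deliberation time, so a higher opportunity cost produces earlier commitment.
    """
    return expected_reward >= max_reward - cost_rate * elapsed


if __name__ == "__main__":
    estimator = RewardRateEstimator()
    estimator.update(reward=1.0, duration=2.0)
    cost = estimator.opportunity_cost()
    print(should_commit(expected_reward=0.6, elapsed=5.0, cost_rate=cost))
```

In this sketch the falling threshold plays the role that the abstract assigns to urgency: when recent reward rates are high, time is costly and the threshold collapses faster, so less evidence is required before committing.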
Date: 2022
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010080 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 10080&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1010080
DOI: 10.1371/journal.pcbi.1010080
More articles in PLOS Computational Biology from Public Library of Science