Optimizing the depth and the direction of prospective planning using information values
Can Eren Sezener,
Amir Dezfouli and
Mehdi Keramati
PLOS Computational Biology, 2019, vol. 15, issue 3, 1-21
Abstract:
Evaluating the future consequences of actions is achievable by simulating a mental search tree into the future. Expanding deep trees, however, is computationally taxing. Therefore, machines and humans use a plan-until-habit scheme that simulates the environment up to a limited depth and then exploits habitual values as proxies for consequences that may arise in the future. Two outstanding questions in this scheme are “in which directions the search tree should be expanded?”, and “when should the expansion stop?”. Here we propose a principled solution to these questions based on a speed/accuracy tradeoff: deeper expansion in the appropriate directions leads to more accurate planning, but at the cost of slower decision-making. Our simulation results show how this algorithm expands the search tree effectively and efficiently in a grid-world environment. We further show that our algorithm can explain several behavioral patterns in animals and humans, namely the effect of time-pressure on the depth of planning, the effect of reward magnitudes on the direction of planning, and the gradual shift from goal-directed to habitual behavior over the course of training. The algorithm also provides several predictions testable in animal/human experiments.
Author summary:
When faced with several choices in complex environments like chess, thinking about all the potential consequences of each choice, infinitely deep into the future, is simply impossible due to time and cognitive limitations. An outstanding question is what is the best direction and depth of thinking about the future? Here we propose a mathematical algorithm that computes, along the course of planning, the benefit of thinking another step in a given direction into the future, and compares that with the cost of thinking in order to compute the net benefit. We show that this algorithm is consistent with several behavioral patterns observed in humans and animals, suggesting that they, too, make efficient use of their time and cognitive resources when deciding how deep to think.
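The net-benefit idea in the summary above can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified stand-in that assumes the value of each candidate direction is held as an independent Gaussian belief (a mean and a standard deviation), and it uses the standard closed-form myopic value of perfect information for such beliefs. The function names (`myopic_voi`, `choose_expansion`) and the fixed per-step `cost` are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def myopic_voi(means, stds, a):
    """Expected gain in the value of the chosen action if the true value
    of action `a` were revealed (assumption: independent Gaussian beliefs,
    at least two actions)."""
    best_other = max(m for i, m in enumerate(means) if i != a)
    sigma = stds[a]
    if sigma <= 0.0:
        return 0.0  # nothing left to learn about this direction
    z = abs(means[a] - best_other) / sigma
    return sigma * (normal_pdf(z) - z * (1.0 - normal_cdf(z)))


def choose_expansion(means, stds, cost):
    """Return the index of the direction with the highest net benefit
    (information value minus thinking cost), or None if no direction
    is worth another step of thinking."""
    net = [myopic_voi(means, stds, a) - cost for a in range(len(means))]
    best = max(range(len(means)), key=lambda a: net[a])
    return best if net[best] > 0.0 else None


# Usage: two directions with equal estimated value; only the first is uncertain.
means, stds = [1.0, 1.0], [1.0, 0.0]
cheap_thinking = choose_expansion(means, stds, cost=0.1)   # think further
costly_thinking = choose_expansion(means, stds, cost=0.5)  # stop and act
```

In this toy setting, raising `cost` (for example, to mimic time pressure) makes the rule stop sooner, and a direction whose value is already certain is never selected for further thinking. How the paper itself estimates these quantities along a search tree is not specified in this listing.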
Date: 2019
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006827 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 06827&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1006827
DOI: 10.1371/journal.pcbi.1006827
More articles in PLOS Computational Biology from Public Library of Science