
Trial-by-trial learning of successor representations in human behavior

Ari E Kahn, Dani S Bassett and Nathaniel D Daw

PLOS Computational Biology, 2025, vol. 21, issue 11, 1-18

Abstract: Decisions in humans and other organisms depend, in part, on learning and using models that capture the statistical structure of the world, including the long-run expected outcomes of our actions. One prominent approach to forecasting such long-run outcomes is the successor representation (SR), which predicts future states aggregated over multiple timesteps. Although much behavioral and neural evidence suggests that people and animals use such a representation, it remains unknown how they acquire it. It has frequently been assumed to be learned by temporal difference bootstrapping (SR-TD(0)), but this assumption has largely not been empirically tested or compared to alternatives including eligibility traces (SR-TD(λ>0)). Here we address this gap by leveraging trial-by-trial reaction times in graph sequence learning tasks, which are favorable for studying learning dynamics because the long horizons in these studies differentiate the transient update dynamics of different learning rules. We examined the behavior of SR-TD(λ) on a probabilistic graph learning task alongside a number of alternatives, and found that behavior was best explained by a hybrid model which learned via SR-TD(λ) alongside an additional predictive model of recency. The relatively large λ we estimate indicates a predominant role of eligibility trace mechanisms over the bootstrap-based chaining typically assumed. Our results provide insight into how humans learn predictive representations, and demonstrate that people simultaneously learn the SR alongside lower-order predictions.

Author summary: Our ability to plan intelligently requires predicting the state of the world multiple steps into the future. Enumerating future outcomes step-by-step, however, is slow and costly. Instead, research has shown that people rely on simplified models of the world that skip across multiple steps at once. How do we construct these simplified models? One promising idea is the successor representation (SR), which predicts future events via a simple and neurally plausible computation. The SR has been shown to explain a range of behavioral phenomena, but these studies have not identified which among many learning rules the brain uses to build the SR. Plausible mechanisms for learning associations over delays (called bootstrapping and eligibility traces) both converge to identical simplified world models, and thus existing studies on the SR, which focus on well trained behavior, are unable to distinguish between them. Here, we answer this question by examining behavior on a graph learning task, where stimulus-by-stimulus reaction times have been shown to reflect predictions over long temporal horizons. Through both model fitting and model-agnostic comparisons, we find that behavior is best explained by a learning rule heavily dependent on eligibility traces, in contrast to previous work which generally assumed an (untested) bootstrapping update rule.
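
To make the contrast between bootstrapping and eligibility traces concrete, the following is a minimal tabular sketch of an SR-TD(λ) update. It is an illustration under stated assumptions, not the authors' fitted model: the function name `sr_td_lambda`, the parameter defaults, the accumulating-trace form, and the convention that the SR counts the current state are all choices made here for the example.

```python
import numpy as np


def sr_td_lambda(states, n_states, alpha=0.1, gamma=0.9, lam=0.5):
    """Tabular SR learning with eligibility traces (illustrative sketch).

    Assumptions (not taken from the paper): accumulating traces, and an SR
    convention in which M[s, s2] includes the current visit to s.
    """
    M = np.zeros((n_states, n_states))  # successor matrix estimate
    e = np.zeros(n_states)              # eligibility trace over predecessor states
    for s, s_next in zip(states[:-1], states[1:]):
        # Decay old traces and mark the state just visited.
        e *= gamma * lam
        e[s] += 1.0
        # Vector-valued TD error: onehot(s) + gamma * M[s_next] - M[s].
        target = gamma * M[s_next]
        target[s] += 1.0
        delta = target - M[s]
        # Every recently visited state is updated in proportion to its trace.
        M += alpha * np.outer(e, delta)
    return M


# One pass through 0 -> 1 -> 2: only a nonzero lambda lets state 0
# learn about state 1 without revisiting state 0.
M_td0 = sr_td_lambda([0, 1, 2], 3, lam=0.0)
M_td1 = sr_td_lambda([0, 1, 2], 3, lam=1.0)
```

With λ = 0 the trace holds only the current state, so credit reaches earlier states solely by chaining through previously learned rows of `M` on later visits. With λ > 0, earlier states are updated immediately. Both settings share the same fixed point, which is why, as the summary notes, only the transient learning dynamics can tell them apart.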

Date: 2025

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013696 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13696&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013696

DOI: 10.1371/journal.pcbi.1013696


More articles in PLOS Computational Biology from Public Library of Science

 
Page updated 2025-11-29
Handle: RePEc:plo:pcbi00:1013696