Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making

He A Xu, Alireza Modirshanechi, Marco P Lehmann, Wulfram Gerstner and Michael H Herzog

PLOS Computational Biology, 2021, vol. 17, issue 6, 1-32

Abstract: Classic reinforcement learning (RL) theories cannot explain human behavior in the absence of external reward or when the environment changes. Here, we employ a deep sequential decision-making paradigm with sparse reward and abrupt environmental changes. To explain the behavior of human participants in these environments, we show that RL theories need to include surprise and novelty, each with a distinct role. While novelty drives exploration before the first encounter of a reward, surprise increases the rate of learning of a world-model as well as of model-free action-values. Even though the world-model is available for model-based RL, we find that human decisions are dominated by model-free action choices. The world-model is only marginally used for planning, but it is important to detect surprising events. Our theory predicts human action choices with high probability and allows us to dissociate surprise, novelty, and reward in EEG signals.

Author summary: Humans like to explore their environment: children play with toys, tourists explore touristic sites, and readers start a new book. Exploration is useful to build knowledge about the world in the form of a ‘world-model’. However, since the world is complex and changing, the learned world-model is sometimes wrong: if so, the feeling of surprise arises. Here, we distinguish surprise from novelty; we show that humans use surprise as a signal to decide when to adapt their behavior, while they use novelty to decide where and what to explore, to eventually develop an improved world-model. Intuitively, it seems obvious to use world-models to plan future actions. However, we show that in a complex and changing environment where planning needs heavy computations, participants rarely follow an explicit plan and take their actions mainly by shaping habits. Importantly, we show that the main role of their world-model is to signal when to be surprised and, hence, when to adapt their habits. In summary, our results show how surprise and novelty interact with human reinforcement learning, contribute to human adaptive and exploratory behavior, and correlate with EEG signals.

Date: 2021

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009070 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 09070&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1009070

DOI: 10.1371/journal.pcbi.1009070

More articles in PLOS Computational Biology from Public Library of Science
 
Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1009070