The Computational Development of Reinforcement Learning during Adolescence
Stefano Palminteri,
Emma J Kilford,
Giorgio Coricelli and
Sarah-Jayne Blakemore
PLOS Computational Biology, 2016, vol. 12, issue 6, 1-25
Abstract:
Adolescence is a period of life characterised by changes in learning and decision-making. Learning and decision-making do not rely on a unitary system, but instead require the coordination of different cognitive processes that can be mathematically formalised as dissociable computational modules. Here, we aimed to trace the developmental time-course of the computational modules responsible for learning from reward or punishment, and learning from counterfactual feedback. Adolescents and adults carried out a novel reinforcement learning paradigm in which participants learned the association between cues and probabilistic outcomes, where the outcomes differed in valence (reward versus punishment) and feedback was either partial or complete (either the outcome of the chosen option only, or the outcomes of both the chosen and unchosen option, were displayed). Computational strategies changed during development: whereas adolescents’ behaviour was better explained by a basic reinforcement learning algorithm, adults’ behaviour integrated increasingly complex computational features, namely a counterfactual learning module (enabling enhanced performance in the presence of complete feedback) and a value contextualisation module (enabling symmetrical reward and punishment learning). Unlike adults, adolescent performance did not benefit from counterfactual (complete) feedback. In addition, while adults learned symmetrically from both reward and punishment, adolescents learned from reward but were less likely to learn from punishment. This tendency to rely on rewards and not to consider alternative consequences of actions might contribute to our understanding of decision-making in adolescence.Author Summary: We employed a novel learning task to investigate how adolescents and adults learn from reward versus punishment, and to counterfactual feedback about decisions. Computational analyses revealed that adults and adolescents did not implement the same algorithm to solve the learning task. In contrast to adults, adolescents’ performance did not take into account counterfactual information; adolescents also learned preferentially to seek rewards rather than to avoid punishments, whereas adults learned to seek and avoid both equally. Increasing our understanding of computational changes in reinforcement learning during adolescence may provide insights into adolescent value-based decision-making. Our results might also have implications for education, since they suggest that adolescents benefit more from positive feedback than from negative feedback in learning tasks.
Date: 2016
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (6)
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004953 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 04953&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1004953
DOI: 10.1371/journal.pcbi.1004953
Access Statistics for this article
More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().