Deterministic response strategies in a trial-and-error learning task

Mohr, Holger; Zwosta, Katharina; Markovic, Dimitrije; Bitzer, Sebastian; Wolfensteller, Uta; Ruge, Hannes

PLOS Computational Biology, 2018, vol. 14, issue 11, 1-19

Abstract: Trial-and-error learning is a universal strategy for establishing which actions are beneficial or harmful in new environments. However, learning stimulus-response associations solely via trial-and-error is often suboptimal, as in many settings dependencies among stimuli and responses can be exploited to increase learning efficiency. Previous studies have shown that in settings featuring such dependencies, humans typically engage high-level cognitive processes and employ advanced learning strategies to improve their learning efficiency. Here we analyze in detail the initial learning phase of a sample of human subjects (N = 85) performing a trial-and-error learning task with deterministic feedback and hidden stimulus-response dependencies. Using computational modeling, we find that the standard Q-learning model cannot sufficiently explain human learning strategies in this setting. Instead, newly introduced deterministic response models, which are theoretically optimal and transform stimulus sequences unambiguously into response sequences, provide the best explanation for 50.6% of the subjects. Most of the remaining subjects either show a tendency towards generic optimal learning (21.2%) or at least partially exploit stimulus-response dependencies (22.3%), while a few subjects (5.9%) show no clear preference for any of the employed models. After the initial learning phase, asymptotic learning performance during the subsequent practice phase is best explained by the standard Q-learning model. Our results show that human learning strategies in the presented trial-and-error learning task go beyond merely associating stimuli and responses via incremental reinforcement. Specifically during initial learning, high-level cognitive processes support sophisticated learning strategies that increase learning efficiency while keeping memory demands and computational efforts bounded. 
The good asymptotic fit of the Q-learning model indicates that these cognitive processes are successively replaced by the formation of stimulus-response associations over the course of learning.

Author summary: Humans and other animals can learn how to respond to novel stimuli by incrementally strengthening or weakening associations between stimuli and responses based on feedback. Q-learning, which is based on a delta learning rule, has been established as the standard computational model for associative learning. By comparing the Q-learning model with alternative computational models, we investigate human learning strategies in a simple trial-and-error learning task, where stimuli mapped onto responses one-to-one and correct responses were invariably rewarded. We find that humans can learn more efficiently than predicted by the Q-learning model in this setting. Specifically, we show that some subjects systematically went through the response options and made inferences across stimuli to improve their learning speed and avoid unnecessary errors during the initial learning phase. However, after the initial learning phase, the Q-learning model provided a better prediction than the competing models. We conclude that human learning behavior in our experimental task can be best explained as a mixture of sophisticated learning strategies involving high-level cognitive processes at the beginning of learning, and associative learning facilitating further performance improvements at later learning stages.
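To make the contrast in the summary concrete, the following is a minimal Python sketch, not the authors' models or code. It pairs a delta-rule Q-learner with a simple deterministic learner that steps through the response options in a fixed order and, once a response is confirmed for one stimulus, excludes it for all others (the one-to-one inference). The task size, the learning rate `alpha`, the softmax parameter `beta`, the 0/1 reward coding, and all function names are assumptions made for illustration only.

```python
import math
import random


def q_update(q, stimulus, response, reward, alpha=0.5):
    """Delta rule: move Q(s, a) toward the observed reward by a fraction alpha."""
    q[stimulus][response] += alpha * (reward - q[stimulus][response])


def softmax_choice(q_row, beta, rng):
    """Sample a response with probability proportional to exp(beta * Q)."""
    weights = [math.exp(beta * value) for value in q_row]
    return rng.choices(range(len(q_row)), weights=weights)[0]


def run_q_learner(mapping, stimuli, alpha=0.5, beta=5.0, seed=0):
    """Count errors of a softmax Q-learner (assumed parameter values)."""
    rng = random.Random(seed)
    n = len(mapping)
    q = [[0.0] * n for _ in range(n)]
    errors = 0
    for s in stimuli:
        a = softmax_choice(q[s], beta, rng)
        reward = 1.0 if a == mapping[s] else 0.0
        if reward == 0.0:
            errors += 1
        q_update(q, s, a, reward, alpha)
    return errors


def run_deterministic_learner(mapping, stimuli):
    """Count errors of a rule-based learner with no randomness.

    For each stimulus it keeps the set of responses not yet ruled out and
    always tries the lowest-numbered one. A confirmed response is removed
    from the candidate sets of all other stimuli, because the mapping is
    one-to-one.
    """
    n = len(mapping)
    candidates = [set(range(n)) for _ in range(n)]
    errors = 0
    for s in stimuli:
        a = min(candidates[s])
        if a == mapping[s]:
            candidates[s] = {a}
            for other in range(n):
                if other != s:
                    candidates[other].discard(a)
        else:
            errors += 1
            candidates[s].discard(a)
    return errors


if __name__ == "__main__":
    mapping = [2, 0, 3, 1]          # hypothetical hidden stimulus-response mapping
    stimuli = [0, 1, 2, 3] * 4      # hypothetical stimulus sequence
    print("deterministic errors:", run_deterministic_learner(mapping, stimuli))
    print("Q-learning errors:", run_q_learner(mapping, stimuli))
```

Because the deterministic learner involves no sampling, the same stimulus sequence always yields the same response sequence, which is the property the abstract attributes to the deterministic response models. The Q-learner sketched here, starting from zero-valued entries with 0/1 rewards, gains nothing from an unrewarded response and cannot transfer information across stimuli.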

Date: 2018

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006621 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 06621&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1006621

DOI: 10.1371/journal.pcbi.1006621

More articles in PLOS Computational Biology from Public Library of Science