Combining Correlation-Based and Reward-Based Learning in Neural Control for Policy Improvement
Poramate Manoonpong,
Christoph Kolodziejski,
Florentin Wörgötter and
Jun Morimoto
Additional contact information
Poramate Manoonpong: Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany;
Christoph Kolodziejski: Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany
Florentin Wörgötter: Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany
Jun Morimoto: Bernstein Center for Computational Neuroscience, The Third Institute of Physics, University of Göttingen, Göttingen 37077, Germany;
Advances in Complex Systems (ACS), 2013, vol. 16, issue 02n03, 1-38
Abstract:
Classical conditioning (conventionally modeled as correlation-based learning) and operant conditioning (conventionally modeled as reinforcement learning or reward-based learning) have both been observed in biological systems. Evidence shows that both mechanisms rely heavily on learning associations. Based on these biological findings, we propose a new learning model for obtaining successful control policies in artificial systems. The model combines correlation-based learning, implemented as input correlation learning (ICO learning), with reward-based learning, implemented as continuous actor–critic reinforcement learning (RL), so that the two act as a dual learner system. Its performance is evaluated in simulations of a cart-pole system, a dynamic motion control problem, and of a mobile robot, a goal-directed behavior control problem. The results show that the combined model markedly improves the pole-balancing control policy: the controller learns to stabilize the pole over a larger domain of initial conditions than either learning mechanism achieves on its own. The model also finds a successful control policy for goal-directed behavior: the robot learns to approach a given goal more effectively than with either of its individual components. The study thus sharpens our understanding of how two different learning mechanisms can be combined and complement each other to solve complex tasks.
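The abstract does not spell out the update rules, but both components are standard. As a rough illustration only, the Python sketch below shows one way an ICO-style correlation-based update (predictive-input weights driven by the derivative of a reflex signal) and a continuous actor–critic TD update could be combined into a single motor command. Class and function names, parameter values, the environment interface (env_observe, env_step) and the simple summation of the two outputs are assumptions made for this sketch, not the authors' actual formulation.

import numpy as np

class DualLearner:
    """Hypothetical dual-learner sketch: an ICO-style correlation-based part
    and an actor-critic reward-based part share the same input vector x and
    their outputs are summed into one motor command."""

    def __init__(self, n_inputs, mu=0.01, alpha=0.1, beta=0.05, gamma=0.95):
        self.u = np.zeros(n_inputs)   # ICO weights on predictive inputs
        self.w = np.zeros(n_inputs)   # actor weights (linear policy)
        self.v = np.zeros(n_inputs)   # critic weights (linear value function)
        self.mu, self.alpha, self.beta, self.gamma = mu, alpha, beta, gamma
        self.prev_x0 = 0.0            # previous reflex signal, for its derivative

    def ico_update(self, x, x0):
        # Correlation-based rule: du_i is proportional to x_i times the
        # temporal derivative of the reflex signal x0.
        dx0 = x0 - self.prev_x0
        self.u += self.mu * x * dx0
        self.prev_x0 = x0

    def actor_critic_update(self, x, x_next, reward):
        # TD error drives both the critic (value) and the actor (policy) weights.
        delta = reward + self.gamma * (self.v @ x_next) - (self.v @ x)
        self.v += self.alpha * delta * x
        self.w += self.beta * delta * x
        return delta

    def action(self, x, x0):
        # Motor command: reflex plus correlation-based term plus reward-based term
        # (reflex weight fixed at 1 in this sketch).
        return x0 + self.u @ x + self.w @ x

# Example control loop; env_observe and env_step stand for an assumed
# environment interface returning predictive inputs x, reflex signal x0 and reward.
learner = DualLearner(n_inputs=4)
x, x0 = env_observe()
for _ in range(1000):
    a = learner.action(x, x0)
    x_next, x0_next, reward = env_step(a)
    learner.ico_update(x, x0)
    learner.actor_critic_update(x, x_next, reward)
    x, x0 = x_next, x0_next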
Keywords: Classical conditioning; operant conditioning; associative learning; reinforcement learning; pole balancing; goal-directed behavior
Date: 2013
Downloads: http://www.worldscientific.com/doi/abs/10.1142/S021952591350015X (access to full text is restricted to subscribers)
Persistent link: https://EconPapers.repec.org/RePEc:wsi:acsxxx:v:16:y:2013:i:02n03:n:s021952591350015x
DOI: 10.1142/S021952591350015X
Advances in Complex Systems (ACS) is currently edited by Frank Schweitzer
More articles in Advances in Complex Systems (ACS) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.