Learning to Utilize Curiosity: A New Approach of Automatic Curriculum Learning for Deep RL
Zeyang Lin,
Jun Lai,
Xiliang Chen,
Lei Cao and
Jun Wang
Additional contact information
Zeyang Lin: Command Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Jun Lai: Command Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Xiliang Chen: Command Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Lei Cao: Command Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Jun Wang: Command Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Mathematics, 2022, vol. 10, issue 14, 1-20
Abstract:
In recent years, reinforcement learning algorithms based on automatic curriculum learning have been increasingly applied to multi-agent system problems. However, in sparse-reward environments, reinforcement learning agents receive almost no feedback from the environment throughout training, which slows the convergence and reduces the learning efficiency of curriculum reinforcement learning algorithms. Building on automatic curriculum learning, this paper proposes a curriculum reinforcement learning method based on the curiosity model (CMCL). The method splits the curriculum sorting criteria into temporal-difference error and curiosity reward: it uses K-fold cross-validation to evaluate the difficulty priority of task samples, uses the Intrinsic Curiosity Module (ICM) to evaluate their curiosity priority, and uses a curriculum factor to adjust the probability with which task samples are learned. The study compares CMCL with baseline algorithms in cooperative-competitive environments, and the simulation results show that CMCL can improve the training performance and robustness of multi-agent deep reinforcement learning algorithms.
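The abstract names two sorting criteria (temporal-difference error and curiosity reward) and a curriculum factor that adjusts how likely each task sample is to be learned. The following is only a rough sketch of how such a blend could look, not the paper's algorithm. The curiosity reward follows the usual ICM form, a scaled squared prediction error of a forward model. Everything else is an assumption made for illustration: the function names, the use of the absolute TD error as a stand-in for the difficulty priority (the K-fold cross-validation step is omitted), and the linear blending rule controlled by `curriculum_factor`.

```python
import random


def curiosity_reward(predicted_next, actual_next, eta=1.0):
    """ICM-style intrinsic reward: (eta / 2) * squared forward-model error."""
    sq_err = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return 0.5 * eta * sq_err


def normalize(values):
    """Scale non-negative values to sum to 1; fall back to uniform if all zero."""
    total = sum(values)
    if total <= 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def sampling_probabilities(td_errors, curiosity_rewards, curriculum_factor):
    """Blend difficulty and curiosity priorities (hypothetical rule).

    curriculum_factor in [0, 1]: 1.0 uses only the TD-error priority,
    0.0 uses only the curiosity priority.
    """
    difficulty = normalize([abs(e) for e in td_errors])  # stand-in for difficulty
    curiosity = normalize(curiosity_rewards)
    blended = [
        curriculum_factor * d + (1.0 - curriculum_factor) * c
        for d, c in zip(difficulty, curiosity)
    ]
    return normalize(blended)


def sample_indices(probabilities, batch_size, seed=0):
    """Draw sample indices according to the blended probabilities."""
    rng = random.Random(seed)
    return rng.choices(range(len(probabilities)), weights=probabilities, k=batch_size)


# Three hypothetical task samples with toy forward-model predictions.
td_errors = [0.5, -1.5, 2.0]
rewards = [
    curiosity_reward([0.0, 1.0], [0.0, 1.0]),
    curiosity_reward([1.0, 0.0], [0.0, 0.0]),
    curiosity_reward([2.0, 0.0], [0.0, 0.0]),
]
probs = sampling_probabilities(td_errors, rewards, curriculum_factor=0.5)
batch = sample_indices(probs, batch_size=4)
```

In this toy setup, the third sample has both the largest TD error and the largest prediction error, so it receives the highest sampling probability. Moving `curriculum_factor` toward 0 or 1 shifts the weight between the two criteria.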
Keywords: deep reinforcement learning; automatic curriculum learning; curiosity; sparse reward
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/10/14/2523/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/14/2523/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:14:p:2523-:d:867199
Mathematics is currently edited by Ms. Emma He