Latent Structure Matching for Knowledge Transfer in Reinforcement Learning

Yi Zhou and Fenglei Yang
Additional contact information
Yi Zhou: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Fenglei Yang: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China

Future Internet, 2020, vol. 12, issue 2, 1-15

Abstract: Reinforcement learning algorithms usually require a large number of empirical samples and converge slowly in practical applications. One solution is to introduce transfer learning: knowledge from well-learned source tasks can be reused to reduce the sample requirement and accelerate the learning of target tasks. However, if a poorly matched source task is selected, it can slow down or even disrupt the learning procedure. It is therefore important for knowledge transfer to select source tasks that closely match the target tasks. In this paper, a novel task matching algorithm is proposed that derives the latent structures of the tasks' value functions and aligns these structures for similarity estimation. Through latent structure matching, highly matched source tasks are selected effectively; knowledge is then transferred from them to give action advice and to improve the exploration strategies of the target tasks. Experiments are conducted on a simulated navigation environment and the mountain car environment. The results show a significant performance gain for the improved exploration strategy compared with the traditional ϵ-greedy exploration strategy. A theoretical proof is also given to verify the improvement of the exploration strategy based on latent structure matching.
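
As a rough illustration of how action advice from a matched source task could modify ϵ-greedy exploration, the sketch below replaces some of the uniformly random exploratory actions with the source task's greedy action. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `advised_epsilon_greedy`, the tabular Q-value arrays, and the `advice_prob` parameter are all hypothetical, and the source task is assumed to share the target task's state and action indexing.

```python
import numpy as np

def advised_epsilon_greedy(q_target, q_source, state, epsilon, advice_prob, rng):
    """Choose an action for `state` (hypothetical helper, not from the paper).

    q_target, q_source: arrays of shape (n_states, n_actions) holding the
    target task's current Q-values and the source task's learned Q-values.
    With probability 1 - epsilon the agent exploits the target Q-values.
    Otherwise it explores: with probability `advice_prob` it follows the
    source task's greedy action, else it picks an action uniformly at random.
    """
    n_actions = q_target.shape[1]
    if rng.random() >= epsilon:
        # Exploit: greedy action under the target task's own estimates.
        return int(np.argmax(q_target[state]))
    if rng.random() < advice_prob:
        # Explore with advice: greedy action under the source task's estimates.
        return int(np.argmax(q_source[state]))
    # Plain exploration: uniform random action, as in standard epsilon-greedy.
    return int(rng.integers(n_actions))

# Example usage with made-up Q-values for a single state and three actions.
rng = np.random.default_rng(0)
q_target = np.array([[0.0, 1.0, 0.5]])
q_source = np.array([[2.0, 0.0, 0.0]])
action = advised_epsilon_greedy(q_target, q_source, 0, 0.1, 0.5, rng)
```

Setting `advice_prob` to 0 recovers standard ϵ-greedy, which makes the two strategies easy to compare in an experiment.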

Keywords: latent structure matching; reinforcement learning; transfer learning; action advice; policy improvement; mountain car
JEL-codes: O3
Date: 2020

Downloads: (external link)
https://www.mdpi.com/1999-5903/12/2/36/pdf (application/pdf)
https://www.mdpi.com/1999-5903/12/2/36/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:12:y:2020:i:2:p:36-:d:320406

Future Internet is currently edited by Ms. Grace You
