On Convergence Rate of MRetrace
Xingguo Chen,
Wangrong Qin,
Yu Gong,
Shangdong Yang and
Wenhao Wang ()
Additional contact information
Xingguo Chen: Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Wangrong Qin: Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Yu Gong: Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Shangdong Yang: Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Wenhao Wang: College of Electronic Engineering, National University of Defense Technology, Changsha 410073, China
Mathematics, 2024, vol. 12, issue 18, 1-19
Abstract:
Off-policy is a key setting for reinforcement learning algorithms. In recent years, the stability of off-policy learning for value-based reinforcement learning has been guaranteed even when combined with linear function approximation and bootstrapping. Convergence rate analysis is currently a hot topic. However, the convergence rates of learning algorithms vary, and analyzing the reasons behind this remains an open problem. In this paper, we propose an essentially simplified version of a convergence rate to generate general off-policy temporal difference learning algorithms. We emphasize that the primary determinant influencing convergence rate is the minimum eigenvalue of the key matrix. Furthermore, we conduct a comparative analysis of the influencing factor across various off-policy learning algorithms in diverse numerical scenarios. The experimental findings validate the proposed determinant, which serves as a benchmark for the design of more efficient learning algorithms.
Keywords: finite sample analysis; off-policy learning; minimum eigenvalues; MRetrace (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/12/18/2930/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/18/2930/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:18:p:2930-:d:1482100
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().