Value enhancement of reinforcement learning via efficient and robust trust region optimization

Shi, Chengchun; Qi, Zhengling; Wang, Jianing; Zhou, Fan

LSE Research Online Documents on Economics from London School of Economics and Political Science, LSE Library

Abstract: Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most methods in the existing literature are developed in online settings where the data are easy to collect or simulate. Motivated by high-stakes domains such as mobile health studies with limited and pre-collected data, in this article, we study offline reinforcement learning methods. To efficiently use these datasets for policy optimization, we propose a novel value enhancement method to improve the performance of a given initial policy computed by existing state-of-the-art RL algorithms. Specifically, when the initial policy is not consistent, our method will output a policy whose value is no worse and often better than that of the initial policy. When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired "value enhancement" property. The proposed method is generally applicable to any parameterized policy that belongs to a certain pre-specified function class (e.g., deep neural networks). Extensive numerical studies are conducted to demonstrate the superior performance of our method. Supplementary materials for this article are available online.

Keywords: mobile health studies; offline reinforcement learning; semi-parametric efficiency; trust region optimization
JEL-codes: C1
Date: 2023-07-20
New Economics Papers: this item is included in nep-big and nep-cmp

Published in Journal of the American Statistical Association, 20, July, 2023, pp. 1-15. ISSN: 0162-1459

Downloads: (external link)
https://researchonline.lse.ac.uk/id/eprint/122756/ Open access version. (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ehl:lserod:122756
