Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

Shi, Chengchun; Qi, Zhengling; Wang, Jianing; Zhou, Fan

Journal of the American Statistical Association, 2024, vol. 119, issue 547, 2011-2025

Abstract: Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most methods in the existing literature are developed in online settings where the data are easy to collect or simulate. Motivated by high-stakes domains such as mobile health studies with limited and pre-collected data, we study offline reinforcement learning methods in this article. To use these datasets efficiently for policy optimization, we propose a novel value enhancement method to improve the performance of a given initial policy computed by existing state-of-the-art RL algorithms. Specifically, when the initial policy is not consistent, our method will output a policy whose value is no worse and often better than that of the initial policy. When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired “value enhancement” property. The proposed method is generally applicable to any parameterized policy that belongs to a certain pre-specified function class (e.g., deep neural networks). Extensive numerical studies are conducted to demonstrate the superior performance of our method. Supplementary materials for this article are available online.

Date: 2024

Downloads: (external link)
http://hdl.handle.net/10.1080/01621459.2023.2238942 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:jnlasa:v:119:y:2024:i:547:p:2011-2025

DOI: 10.1080/01621459.2023.2238942

Journal of the American Statistical Association is currently edited by Xuming He, Jun Liu, Joseph Ibrahim and Alyson Wilson
