EconPapers    
Economics at your fingertips  
 

Robust Exploratory Stopping under Ambiguity in Reinforcement Learning

Junyan Ye, Hoi Ying Wong and Kyunghyun Park

Papers from arXiv.org

Abstract: We propose and analyze a continuous-time robust reinforcement learning framework for optimal stopping under ambiguity. In this framework, an agent chooses a robust exploratory stopping time motivated by two objectives: robust decision-making under ambiguity and learning about the unknown environment. Here, ambiguity refers to considering multiple probability measures dominated by a reference measure, reflecting the agent's awareness that the reference measure representing her learned belief about the environment would be erroneous. Using the $g$-expectation framework, we reformulate the optimal stopping problem under ambiguity as a robust exploratory control problem with Bernoulli distributed controls. We then characterize the optimal Bernoulli distributed control via backward stochastic differential equations and, based on this, construct the robust exploratory stopping time that approximates the optimal stopping time under ambiguity. Last, we establish a policy iteration theorem and implement it as a reinforcement learning algorithm. Numerical experiments demonstrate the convergence, robustness, and scalability of our reinforcement learning algorithm across different levels of ambiguity and exploration.

Date: 2025-10, Revised 2026-04
New Economics Papers: this item is included in nep-cmp
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://arxiv.org/pdf/2510.10260 Latest version (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2510.10260

Access Statistics for this paper

More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().

 
Page updated 2026-04-24
Handle: RePEc:arx:papers:2510.10260