Entropy-regularized penalization schemes for American options and reflected BSDEs with singular generators
Daniel Chee, Noufel Frikha and Libo Li
Additional contact information
Daniel Chee: School of Mathematics and Statistics, University of New South Wales, Sydney, NSW 2052, Australia
Noufel Frikha: CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, UP1 - Université Paris 1 Panthéon-Sorbonne
Libo Li: School of Mathematics and Statistics, University of New South Wales, Sydney, NSW 2052, Australia
Working Papers from HAL
Abstract:
This paper extends our previous work in Chee et al. (2025) to continuous-time optimal stopping problems, with a particular focus on American options within an exploratory framework. We pursue two main objectives. First, motivated by reinforcement learning applications, we introduce an entropy-regularized penalization scheme for continuous-time optimal stopping problems. The scheme is inspired by classical penalization techniques for reflected backward stochastic differential equations (RBSDEs) and provides a smooth approximation of the degenerate stopping rule inherent to the American option problem. This regularization promotes exploration, enables the use of gradient-based optimization methods, and leads naturally to policy improvement algorithms. We establish well-posedness and convergence properties of the scheme, and illustrate its numerical feasibility through low-dimensional experiments based on policy iteration and least-squares Monte Carlo methods. Second, from a theoretical perspective, we study the asymptotic limit of the entropy-regularized penalization as the penalization parameter tends to infinity. We show that the limiting value process solves a reflected BSDE with a logarithmically singular driver, and we prove existence and uniqueness of solutions to this new class of RBSDEs via a monotone limit argument. To the best of our knowledge, such equations have not previously been investigated in the literature.
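Illustration (not taken from the paper): the abstract describes entropy regularization as giving a smooth approximation of the degenerate stop/continue rule. A minimal sketch of the general mechanism follows, and it should not be read as the authors' scheme. It assumes the penalization term is a positive part max(a, 0), where a stands for the gap between obstacle and value, written as a supremum of p*a over a stopping probability p in [0, 1]. Adding a Bernoulli entropy bonus with temperature lam turns the bang-bang maximizer into a sigmoid and the positive part into a softplus. The function names and the parameter lam are hypothetical.

```python
import math


def bernoulli_entropy(p):
    """Shannon entropy of a Bernoulli(p) variable, with 0*log(0) taken as 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def soft_positive_part(a, lam):
    """Closed form of sup over p in [0,1] of p*a + lam*H(p), i.e. lam*log(1+exp(a/lam)).

    Written in a numerically stable way; tends to max(a, 0) as lam -> 0.
    """
    x = a / lam
    return lam * (max(x, 0.0) + math.log1p(math.exp(-abs(x))))


def soft_stopping_probability(a, lam):
    """Maximizer of p*a + lam*H(p): the sigmoid of a/lam (a smooth stopping rule)."""
    x = a / lam
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def brute_force_sup(a, lam, grid=10001):
    """Grid search over p, used only to sanity-check the closed form."""
    return max(
        (i / (grid - 1)) * a + lam * bernoulli_entropy(i / (grid - 1))
        for i in range(grid)
    )


if __name__ == "__main__":
    a = 0.3  # hypothetical gap between obstacle and value
    for lam in (1.0, 0.1, 0.001):
        print(lam, soft_positive_part(a, lam), soft_stopping_probability(a, lam))
```

As lam shrinks, the soft stopping probability approaches the indicator of a > 0 and the softplus approaches max(a, 0), which is the sense in which the regularized rule is a smooth surrogate for the hard one. Because the surrogate is differentiable in a, gradient-based methods can be applied to it.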
Keywords: American option; Optimal stopping; Reflected Backward Stochastic Differential Equation; Entropy regularization; Policy improvement algorithm
Date: 2026-02-20
Note: View the original document on HAL open archive server: https://hal.science/hal-05520660v1
Download: https://hal.science/hal-05520660v1/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:wpaper:hal-05520660