Interpretable Optimal Stopping

Ciocan, Dragos Florin; Mišić, Velibor V.


Dragos Florin Ciocan and Velibor V. Mišić
Additional contact information
Dragos Florin Ciocan: European Institute of Business Administration (INSEAD), 77305 Fontainebleau, France
Velibor V. Mišić: Anderson School of Management, University of California, Los Angeles, Los Angeles, California 90095

Management Science, 2022, vol. 68, issue 3, 1616-1638

Abstract: Optimal stopping is the problem of deciding when to stop a stochastic system to obtain the greatest reward, arising in numerous application areas such as finance, healthcare, and marketing. State-of-the-art methods for high-dimensional optimal stopping involve approximating the value function or the continuation value and then using that approximation within a greedy policy. Although such policies can perform very well, they are generally not guaranteed to be interpretable; that is, a decision maker may not be able to easily see the link between the current system state and the policy’s action. In this paper, we propose a new approach to optimal stopping wherein the policy is represented as a binary tree, in the spirit of naturally interpretable tree models commonly used in machine learning. We show that the class of tree policies is rich enough to approximate the optimal policy. We formulate the problem of learning such policies from observed trajectories of the stochastic system as a sample average approximation (SAA) problem. We prove that the SAA problem converges under mild conditions as the sample size increases but that, computationally, even immediate simplifications of the SAA problem are theoretically intractable. We thus propose a tractable heuristic for approximately solving the SAA problem by greedily constructing the tree from the top down. We demonstrate the value of our approach by applying it to the canonical problem of option pricing, using both synthetic instances and instances using real Standard & Poor’s 500 Index data. Our method obtains policies that (1) outperform state-of-the-art noninterpretable methods, based on simulation regression and martingale duality, and (2) possess a remarkably simple and intuitive structure.
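Illustration: the abstract describes a stopping policy represented as a binary tree, scored by its sample-average reward over observed trajectories. The sketch below shows the smallest version of that idea: a tree with a single split, whose threshold is chosen to maximize the sample-average discounted payoff. It is not the authors' algorithm, which builds deeper trees greedily from the top down. Everything specific here is an assumption made for the example: the geometric Brownian motion price model, the put payoff, the parameter values, and the function names `simulate_paths`, `depth_one_policy`, `saa_reward`, and `best_threshold`.

```python
import math
import random


def simulate_paths(n_paths, n_steps, s0=100.0, r=0.05, sigma=0.2, dt=1.0 / 12, seed=0):
    """Simulate price paths under geometric Brownian motion (assumed toy model)."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        s, path = s0, []
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            s *= math.exp((r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
            path.append(s)
        paths.append(path)
    return paths


def put_payoff(price, strike=100.0):
    """Reward for stopping at `price` (assumed put option payoff)."""
    return max(strike - price, 0.0)


def depth_one_policy(threshold):
    """Tree with one split: stop if and only if price <= threshold."""
    def policy(t, price):
        return price <= threshold
    return policy


def saa_reward(policy, paths, r=0.05, dt=1.0 / 12):
    """Sample-average discounted reward of following `policy` on every path.

    Stopping is forced at the final period of each path.
    """
    total = 0.0
    for path in paths:
        for t, price in enumerate(path, start=1):
            if policy(t, price) or t == len(path):
                total += math.exp(-r * dt * t) * put_payoff(price)
                break
    return total / len(paths)


def best_threshold(paths, candidates):
    """Pick the candidate split point with the highest sample-average reward."""
    return max(candidates, key=lambda c: saa_reward(depth_one_policy(c), paths))


if __name__ == "__main__":
    paths = simulate_paths(n_paths=500, n_steps=12)
    candidates = [0.0] + [70.0 + 5.0 * k for k in range(7)]  # 0.0 means "wait until the end"
    threshold = best_threshold(paths, candidates)
    print("chosen threshold:", threshold)
    print("sample-average reward:", saa_reward(depth_one_policy(threshold), paths))
```

The resulting policy reads as a single rule ("stop when the price falls to the threshold or below"), which is the kind of direct link between state and action that the abstract calls interpretable. The reward printed here is computed on the same paths used to choose the threshold, so it is an in-sample figure.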

Keywords: optimal stopping; approximate dynamic programming; interpretability; decision trees; option pricing
Date: 2022

Downloads:
http://dx.doi.org/10.1287/mnsc.2020.3592 (application/pdf)



Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:68:y:2022:i:3:p:1616-1638

