Policy gradient learning methods for stochastic control with exit time and applications to share repurchase pricing
Mohamed Hamdouche, Pierre Henry-Labordere and Huyen Pham
Papers from arXiv.org
Abstract:
We develop policy gradient methods for stochastic control with exit time in a model-free setting. We propose two types of algorithms: one that learns the optimal policy directly, and one that learns alternately the value function (critic) and the optimal control (actor). The use of randomized policies is crucial, notably for overcoming the difficulty that the exit time creates in the gradient computation. We demonstrate the effectiveness of our approach by applying our numerical schemes to the problem of share repurchase pricing. Our results show that the proposed policy gradient methods outperform PDE and other neural-network techniques in a model-based setting. Furthermore, our algorithms are flexible enough to incorporate realistic market conditions such as price impact or transaction costs.
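The direct policy-learning variant described in the abstract can be illustrated with a short sketch. Everything below is an illustrative assumption rather than the paper's specification: the Gaussian policy network, the toy controlled diffusion, the quadratic running cost, the barrier defining the exit time, and all parameter values are invented for this example. It only shows the generic ingredient the abstract highlights, namely a score-function (REINFORCE) gradient computed on trajectories that are stopped at a random exit time, which the randomized policy keeps well defined.

```python
# Minimal sketch: policy gradient with a random exit time, under an assumed
# toy model (NOT the paper's share repurchase model). Each path is frozen
# once it leaves the domain {x < barrier}; log-probabilities and running
# costs accumulate only while the path is alive.
import torch
import torch.nn as nn

torch.manual_seed(0)

class GaussianPolicy(nn.Module):
    """Maps the state (t, x) to the mean of a Gaussian action distribution."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.log_std = nn.Parameter(torch.zeros(1))  # learned, state-independent std

    def forward(self, state):
        return self.net(state).squeeze(-1), self.log_std.exp()

def simulate_batch(policy, batch=256, n_steps=50, dt=0.02, sigma=0.1, barrier=1.5):
    """Roll out X_{k+1} = X_k + a_k dt + sigma dW_k until exit or horizon."""
    x = torch.zeros(batch)
    alive = torch.ones(batch)
    log_probs = torch.zeros(batch)
    cost = torch.zeros(batch)
    for k in range(n_steps):
        state = torch.stack([torch.full((batch,), k * dt), x], dim=1)
        mean, std = policy(state)
        dist = torch.distributions.Normal(mean, std)
        a = dist.sample()                        # sampled action (no reparam grad)
        log_probs = log_probs + alive * dist.log_prob(a)
        cost = cost + alive * a.pow(2) * dt      # illustrative running cost
        x = x + alive * (a * dt + sigma * dt ** 0.5 * torch.randn(batch))
        alive = alive * (x < barrier).float()    # exit: path stops contributing
    cost = cost + (barrier - x).clamp(min=0.0)   # illustrative terminal penalty
    return log_probs, cost

policy = GaussianPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for it in range(200):
    log_probs, cost = simulate_batch(policy)
    # REINFORCE estimator with a mean baseline to reduce variance
    loss = ((cost - cost.mean()).detach() * log_probs).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The actor-critic variant mentioned in the abstract would replace the mean baseline above with a learned value function trained alternately with the policy; that extension is omitted from this sketch.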
Date: 2023-02
New Economics Papers: this item is included in nep-big
Downloads: http://arxiv.org/pdf/2302.07320 (latest version, application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2302.07320