Pontryagin-Guided Deep Policy Learning for Constrained Dynamic Portfolio Choice
Jeonggyu Huh, Jaegi Jeon, Hyeng Keun Koo and Byung Hwa Lim
Papers from arXiv.org
Abstract:
We present a Pontryagin-Guided Direct Policy Optimization (PG-DPO) framework for constrained continuous-time portfolio-consumption problems that scales to hundreds of assets. The method couples neural policies with Pontryagin's Maximum Principle (PMP) and enforces feasibility via a lightweight stagewise log-barrier solve; a manifold-projection variant (P-PGDPO) projects controls onto the PMP/KKT manifold using stabilized adjoints. We prove a barrier-KKT correspondence with $O(\epsilon)$ policy error and $O(\epsilon^2)$ instantaneous Hamiltonian gap, and extend the BPTT-PMP match to constrained settings. On problems with short-sale constraints (nonnegative weights with a floating cash position) and wealth-proportional consumption caps, P-PGDPO reduces risky-weight errors by orders of magnitude relative to baseline PG-DPO, while the one-dimensional consumption control shows smaller but consistent gains near binding caps. The approach remains effective when closed-form benchmarks are unavailable and extends readily to transaction costs and interacting limits, promising even greater benefits under time-varying investment opportunities, where classical solutions are scarce.
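The barrier-KKT correspondence cited above rests on a standard interior-point identity; the following is a sketch under generic assumptions (control constraints $g_i(u) \le 0$, a smooth Hamiltonian $H$), not the paper's exact statement. The stagewise barrier problem and its first-order condition are

\[
  u_\epsilon \in \arg\max_{u}\; H(t,x,u,p) + \epsilon \sum_i \log\!\bigl(-g_i(u)\bigr),
  \qquad
  \nabla_u H(t,x,u_\epsilon,p) - \sum_i \frac{\epsilon}{-g_i(u_\epsilon)}\,\nabla_u g_i(u_\epsilon) = 0,
\]

which is KKT stationarity with implied multipliers $\lambda_i^\epsilon = \epsilon/(-g_i(u_\epsilon)) \ge 0$ and slackness residual $\lambda_i^\epsilon\bigl(-g_i(u_\epsilon)\bigr) = \epsilon$. As $\epsilon \to 0$ the barrier maximizer approaches the KKT point of the constrained Hamiltonian maximization, which is the mechanism behind rates of the kind quoted in the abstract ($O(\epsilon)$ in the policy, $O(\epsilon^2)$ in the instantaneous Hamiltonian).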
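To make the training scheme concrete, here is a minimal self-contained PyTorch sketch (hypothetical code, not the authors' implementation). A neural policy maps (time, log-wealth) to risky weights and a consumption fraction; short-sale feasibility is enforced here by a simple softmax parameterization (the paper instead uses a stagewise log-barrier solve with floating cash), while the consumption cap enters through a log-barrier term with weight epsilon; the gradient comes from backpropagating through the simulated dynamics (BPTT), which the paper relates to PMP adjoints. All coefficients and names are illustrative assumptions.

    import math
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Illustrative market and preference parameters (assumptions, not from the paper)
    N_ASSETS, N_STEPS, N_PATHS = 5, 50, 256
    T, GAMMA, RHO, EPS = 1.0, 2.0, 0.04, 1e-3   # horizon, CRRA, discount rate, barrier weight
    C_CAP = 0.5                                  # wealth-proportional consumption cap
    r = 0.02                                     # risk-free rate
    mu = 0.05 + 0.02 * torch.rand(N_ASSETS)      # risky drifts
    sigma = 0.2 * torch.eye(N_ASSETS)            # volatility matrix
    dt = T / N_STEPS

    class Policy(nn.Module):
        """Maps (t, log-wealth) to risky weights and a consumption fraction."""
        def __init__(self, hidden=64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(),
                                     nn.Linear(hidden, N_ASSETS + 1))
        def forward(self, t, x):
            out = self.net(torch.stack([t, x], dim=-1))
            # softmax: nonnegative weights summing to one (a simple short-sale-feasible parameterization)
            pi = torch.softmax(out[..., :N_ASSETS], dim=-1)
            # sigmoid keeps the consumption fraction strictly inside (0, C_CAP)
            c = C_CAP * torch.sigmoid(out[..., -1])
            return pi, c

    def crra(cons):
        return cons.pow(1.0 - GAMMA) / (1.0 - GAMMA)

    policy = Policy()
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    for it in range(200):
        x = torch.zeros(N_PATHS)                 # log wealth, W_0 = 1
        J = torch.zeros(N_PATHS)                 # accumulated discounted utility
        for k in range(N_STEPS):
            t = torch.full((N_PATHS,), k * dt)
            pi, c = policy(t, x)
            W = x.exp()
            # log-barrier reward for slack under the cap, with weight EPS
            barrier = EPS * torch.log(C_CAP - c + 1e-8)
            J = J + math.exp(-RHO * k * dt) * (crra(c * W) + barrier) * dt
            # Euler-Maruyama step for log wealth
            vol = pi @ sigma
            drift = r + pi @ (mu - r) - c - 0.5 * vol.pow(2).sum(-1)
            dW = math.sqrt(dt) * torch.randn(N_PATHS, N_ASSETS)
            x = x + drift * dt + (vol * dW).sum(-1)
        J = J + math.exp(-RHO * T) * crra(x.exp())   # bequest utility at T
        loss = -J.mean()
        opt.zero_grad()
        loss.backward()                               # BPTT through the simulated paths
        opt.step()

The projection variant (P-PGDPO) would post-process the learned controls by projecting them onto the PMP/KKT conditions using the adjoint estimates; that step is omitted here for brevity.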
Date: 2025-01, Revised 2025-09
Downloads: http://arxiv.org/pdf/2501.12600 (latest version, application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2501.12600