A primal–dual policy iteration algorithm for constrained Markov decision processes
Zeyu Liu,
Xueping Li and
Anahita Khojandi
European Journal of Operational Research, 2026, vol. 328, issue 1, 174-188
Abstract:
The solution algorithms of Constrained Markov Decision Process (CMDP), a widely adopted model for sequential decision-making, have been intensively studied in the literature. Despite increasing effort, the Linear Programming (LP) formulation of CMDP remains the dominant exact method that leads to the optimal solution without constraint violations. However, the LP formulation is computationally inefficient due to the curse of dimensionality in CMDP state and action spaces. In this study, we introduce a novel policy iteration method for CMDP, based on decomposition and row-generation techniques. We design a Primal–Dual Policy Iteration (PDPI) algorithm that utilizes state values and Lagrangian multipliers to improve randomized stationary policies in an iterative fashion. We analytically show that upon convergence, PDPI produces the optimal solution for CMDP. An upper bound of the convergence iterations is also given. To validate the algorithm performance, we conduct comprehensive computational experiments on six benchmarking problems curated from the literature. Results show that PDPI outperforms conventional methods considerably, improving the total algorithm runtime by up to 89.19%. The improvement becomes more significant as the problem size grows larger. We further provide insights and discuss the impact of the developed method.
Keywords: Markov processes; Decomposition; Policy iteration; Row generation (search for similar items in EconPapers)
Date: 2026
References: Add references at CitEc
Citations:
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0377221725006757
Full text for ScienceDirect subscribers only
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:eee:ejores:v:328:y:2026:i:1:p:174-188
DOI: 10.1016/j.ejor.2025.08.038
Access Statistics for this article
European Journal of Operational Research is currently edited by Roman Slowinski, Jesus Artalejo, Jean-Charles. Billaut, Robert Dyson and Lorenzo Peccati
More articles in European Journal of Operational Research from Elsevier
Bibliographic data for series maintained by Catherine Liu ().