Decomposable Markov Decision Processes: A Fluid Optimization Approach
Dimitris Bertsimas () and
Velibor V. Mišić ()
Additional contact information
Dimitris Bertsimas: Sloan School of Management and Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Velibor V. Mišić: Anderson School of Management, University of California, Los Angeles, Los Angeles, California 90095
Operations Research, 2016, vol. 64, issue 6, 1537-1555
Abstract:
Decomposable Markov decision processes (MDPs) are problems where the stochastic system can be decomposed into multiple individual components. Although such MDPs arise naturally in many practical applications, they are often difficult to solve exactly due to the enormous size of the state space of the complete system, which grows exponentially with the number of components. In this paper, we propose an approximate solution approach to decomposable MDPs that is based on re-solving a fluid linear optimization formulation of the problem at each decision epoch. This formulation tractably approximates the problem by modeling transition behavior at the level of the individual components rather than the complete system. We prove that our fluid formulation provides a tighter bound on the optimal value function than three state-of-the-art formulations: the approximate linear optimization formulation, the classical Lagrangian relaxation formulation, and a novel, alternate Lagrangian relaxation that is based on relaxing an action consistency constraint. We provide a numerical demonstration of the effectiveness of the approach in the area of multiarmed bandit problems, where we show that our approach provides near optimal performance and outperforms state-of-the-art algorithms.
Keywords: dynamic programming; optimal control; probability; Markov processes; programming; linear; applications (search for similar items in EconPapers)
Date: 2016
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (3)
Downloads: (external link)
http://dx.doi.org/10.1287/opre.2016.1531 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:64:y:2016:i:6:p:1537-1555
Access Statistics for this article
More articles in Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().