EconPapers    
Economics at your fingertips  
 

Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Arnab Nilim () and Laurent El Ghaoui ()
Additional contact information
Arnab Nilim: Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720
Laurent El Ghaoui: Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720

Operations Research, 2005, vol. 53, issue 5, 780-798

Abstract: Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems.We consider a robust control problem for a finite-state, finite-action Markov decision process, where uncertainty on the transition matrices is described in terms of possibly nonconvex sets. We show that perfect duality holds for this problem, and that as a consequence, it can be solved with a variant of the classical dynamic programming algorithm, the “robust dynamic programming” algorithm. We show that a particular choice of the uncertainty sets, involving likelihood regions or entropy bounds, leads to both a statistically accurate representation of uncertainty, and a complexity of the robust recursion that is almost the same as that of the classical recursion. Hence, robustness can be added at practically no extra computing cost. We derive similar results for other uncertainty sets, including one with a finite number of possible values for the transition matrices.We describe in a practical path planning example the benefits of using a robust strategy instead of the classical optimal strategy; even if the uncertainty level is only crudely guessed, the robust strategy yields a much better worst-case expected travel time.

Keywords: dynamic programming: Markov; finite state; game theory; programming: convex; uncertainty; robustness; statistics: estimation (search for similar items in EconPapers)
Date: 2005
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (61)

Downloads: (external link)
http://dx.doi.org/10.1287/opre.1050.0216 (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:53:y:2005:i:5:p:780-798

Access Statistics for this article

More articles in Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().

 
Page updated 2025-03-19
Handle: RePEc:inm:oropre:v:53:y:2005:i:5:p:780-798