Robust Markov Decision Processes
Wolfram Wiesemann,
Daniel Kuhn and
Berç Rustem
No 34, Working Papers from COMISEF
Abstract:
Markov decision processes (MDPs) are powerful tools for decision making in uncertain dynamic environments. However, the solutions of MDPs are of limited practical use due to their sensitivity to distributional model parameters, which are typically unknown and have to be estimated by the decision maker. To counter the detrimental effects of estimation errors, we consider robust MDPs that offer probabilistic guarantees in view of the unknown parameters. To this end, we assume that an observation history of the MDP is available. Based on this history, we derive a confidence region that contains the unknown parameters with a pre-specified probability 1 - β. Afterwards, we determine a policy that attains the highest worst-case performance over this confidence region. By construction, this policy achieves or exceeds its worst-case performance with a confidence of at least 1 - β. Our method involves the solution of tractable conic programs of moderate size.
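The two steps described in the abstract (build a confidence region from observed transitions, then optimize against the worst case inside it) can be illustrated with a small sketch. This is not the authors' method: instead of the paper's conic programs, it assumes a separate L1 ball around each empirical transition row, with a radius taken from a Weissman-type concentration bound and a union bound over all state-action pairs, together with known rewards, a discount factor, and plain robust value iteration. All function names and the toy data are hypothetical.

```python
import math


def l1_radius(n, num_states, delta):
    """L1 radius that covers the true row with probability >= 1 - delta.

    Assumption: Weissman-type bound for n i.i.d. samples, capped at 2
    (the largest possible L1 distance between two distributions).
    """
    if num_states < 2:
        return 0.0
    if n == 0:
        return 2.0
    return min(2.0, math.sqrt(2.0 * math.log((2 ** num_states - 2) / delta) / n))


def worst_case_expectation(p_hat, v, eps):
    """Minimize p . v over distributions p with ||p - p_hat||_1 <= eps."""
    p = list(p_hat)
    lo = min(range(len(v)), key=lambda s: v[s])
    # Move up to eps/2 of probability mass onto the lowest-value state ...
    shift = min(eps / 2.0, 1.0 - p[lo])
    p[lo] += shift
    # ... and take it away from the highest-value states first.
    for s in sorted(range(len(v)), key=lambda s: v[s], reverse=True):
        if s == lo:
            continue
        take = min(p[s], shift)
        p[s] -= take
        shift -= take
        if shift <= 0.0:
            break
    return sum(ps * vs for ps, vs in zip(p, v))


def robust_value_iteration(counts, rewards, gamma, beta, tol=1e-8):
    """counts[s][a][t]: observed transitions s -> t under action a.

    rewards[s][a] is assumed known. Returns (values, policy).
    """
    S, A = len(counts), len(counts[0])
    delta = beta / (S * A)  # union bound across all (state, action) rows
    p_hat, eps = {}, {}
    for s in range(S):
        for a in range(A):
            n = sum(counts[s][a])
            p_hat[s, a] = [c / n for c in counts[s][a]] if n else [1.0 / S] * S
            eps[s, a] = l1_radius(n, S, delta)
    v = [0.0] * S
    while True:
        q = [[rewards[s][a]
              + gamma * worst_case_expectation(p_hat[s, a], v, eps[s, a])
              for a in range(A)] for s in range(S)]
        new_v = [max(row) for row in q]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            policy = [max(range(A), key=lambda a: q[s][a]) for s in range(S)]
            return new_v, policy
        v = new_v


# Toy history: state 1 pays reward 1, state 0 pays nothing. In state 0,
# action 1 looks better empirically (3 of 4 moves reach state 1) but has
# only 4 observations, while action 0 has 50.
counts = [[[25, 25], [1, 3]],
          [[10, 40], [10, 40]]]
rewards = [[0.0, 0.0], [1.0, 1.0]]
values, policy = robust_value_iteration(counts, rewards, gamma=0.9, beta=0.05)
print(values, policy)
```

In this toy history the robust policy avoids the sparsely observed action in state 0, because its confidence ball is wide enough to include a transition row that never reaches the rewarding state. With more observations the ball shrinks and the empirically better action can become the robust choice.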
Pages: 47
Date: 2010-05-05
Downloads: (external link)
http://comisef.eu/files/wps034.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:com:wpaper:034