Risk-Sensitive Markov Decision Processes
Ronald A. Howard and James E. Matheson
Additional contact information
Ronald A. Howard: Stanford University
James E. Matheson: Stanford Research Institute
Management Science, 1972, vol. 18, issue 7, 356-369
Abstract:
This paper considers the maximization of certain equivalent reward generated by a Markov decision process with constant risk sensitivity. First, value iteration is used to optimize possibly time-varying processes of finite duration. Then a policy iteration procedure is developed to find the stationary policy with highest certain equivalent gain for the infinite duration case. A simple example demonstrates both procedures.
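The finite-duration value iteration mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the recursion below is an assumption: it uses the standard exponential-utility form, in which the certain equivalent of a lottery X under risk-aversion coefficient gamma is -(1/gamma) ln E[exp(-gamma X)]. The function name, the array layout `P[k][i][j]` and `R[k][i][j]`, and the two-state example are hypothetical and are not taken from the paper.

```python
import math

def risk_sensitive_value_iteration(P, R, gamma, horizon):
    """Backward recursion on certain equivalents (illustrative sketch).

    P[k][i][j]: probability of moving from state i to j under alternative k.
    R[k][i][j]: reward earned on that transition.
    gamma: nonzero risk-aversion coefficient (assumed exponential utility);
           gamma > 0 is risk-averse, gamma < 0 is risk-seeking.
    Returns (v, policy): v[i] is the certain equivalent with `horizon` stages
    remaining; policy[n][i] is the chosen alternative with n + 1 stages remaining.
    """
    n_alternatives = len(P)
    n_states = len(P[0])
    v = [0.0] * n_states  # assumed terminal certain equivalents of zero
    policy = []
    for _ in range(horizon):
        new_v, decisions = [], []
        for i in range(n_states):
            best_k, best_ce = None, -math.inf
            for k in range(n_alternatives):
                # E[exp(-gamma * (reward + future certain equivalent))]
                m = sum(
                    P[k][i][j] * math.exp(-gamma * (R[k][i][j] + v[j]))
                    for j in range(n_states)
                )
                ce = -math.log(m) / gamma  # certain equivalent of alternative k
                if ce > best_ce:
                    best_k, best_ce = k, ce
            new_v.append(best_ce)
            decisions.append(best_k)
        v = new_v
        policy.append(decisions)
    return v, policy

# Hypothetical two-state example: alternative 0 pays 1 for sure,
# alternative 1 pays 0 or 2.2 with equal probability (mean 1.1).
P = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 2.2], [0.0, 2.2]],
]
v_averse, policy_averse = risk_sensitive_value_iteration(P, R, gamma=1.0, horizon=3)
v_mild, policy_mild = risk_sensitive_value_iteration(P, R, gamma=0.01, horizon=1)
```

In this example a strongly risk-averse decision maker (gamma = 1.0) takes the sure reward at every stage, while a nearly risk-neutral one (gamma = 0.01) prefers the gamble with the higher mean. The infinite-duration policy iteration procedure is not sketched here.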
Date: 1972
Downloads: (external link)
http://dx.doi.org/10.1287/mnsc.18.7.356 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:18:y:1972:i:7:p:356-369