A Weighted Markov Decision Process
Dmitry Krass,
Jerzy A. Filar and
Sagnik S. Sinha
Additional contact information
Dmitry Krass: University of Toronto, Toronto, Ontario, Canada
Jerzy A. Filar: University of Maryland at Baltimore County, Baltimore, Maryland
Sagnik S. Sinha: Indian Statistical Institute, New Delhi, India
Operations Research, 1992, vol. 40, issue 6, 1180-1187
Abstract:
The two most commonly considered reward criteria for Markov decision processes are the discounted reward and the long-term average reward. The first tends to “neglect” the future, concentrating on the short-term rewards, while the second one tends to do the opposite. We consider a new reward criterion consisting of the weighted combination of these two criteria, thereby allowing the decision maker to place more or less emphasis on the short-term versus the long-term rewards by varying their weights. The mathematical implications of the new criterion include: the deterministic stationary policies can be outperformed by the randomized stationary policies, which in turn can be outperformed by the nonstationary policies; an optimal policy might not exist. We present an iterative algorithm for computing an ε-optimal nonstationary policy with a very simple structure.
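The weighted criterion described in the abstract can be sketched numerically for a fixed stationary policy. The two-state chain, rewards, discount factor β, and weight λ below are illustrative assumptions, not taken from the paper, and the (1 − β) normalization is one common way to put the discounted and average terms on the same per-period scale; the paper's exact formulation may differ.

```python
import numpy as np

# Illustrative 2-state Markov chain induced by a fixed stationary policy.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # transition probabilities under the policy
r = np.array([1.0, 5.0])     # one-step rewards in each state
beta = 0.5                   # discount factor (assumed)
lam = 0.3                    # weight on the discounted criterion (assumed)

# Discounted value vector: v solves (I - beta * P) v = r.
v_disc = np.linalg.solve(np.eye(2) - beta * P, r)

# Long-run average reward: g = pi_stat @ r, where pi_stat is the
# stationary distribution solving pi P = pi with sum(pi) = 1.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi_stat, *_ = np.linalg.lstsq(A, b, rcond=None)
g = pi_stat @ r

# Weighted criterion per starting state: the (1 - beta) factor rescales
# the discounted value to a per-period quantity before mixing.
V_weighted = lam * (1 - beta) * v_disc + (1 - lam) * g
```

Varying `lam` between 0 and 1 shifts emphasis between the short-term (discounted) and long-term (average) objectives; the paper shows that optimizing this mixture may require nonstationary policies.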
Keywords: decision analysis: sequential, tradeoffs between discounted and long-term average objectives; dynamic programming: Markov, finite state, new reward criteria for Markov decision processes
Date: 1992
Citations: 3 (in EconPapers)
Downloads: http://dx.doi.org/10.1287/opre.40.6.1180 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:40:y:1992:i:6:p:1180-1187
Access Statistics for this article
More articles in Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher.