Mean Variance Optimality Criteria for Discounted Markov Decision Process
Satia J K
IIMA Working Papers from Indian Institute of Management Ahmedabad, Research and Publication Department
Abstract:
The criterion of maximizing expected rewards has been widely used in Markov decision processes following Howard [2]. Recently, considerations related to higher moments of rewards have also been incorporated by Jaquette [4] and Goldwerger [1]. This paper considers mean-variance criteria for discounted Markov decision processes. Variability in rewards arising both from the variability of rewards during each period and from the stochastic nature of transitions is considered. It is shown that randomized policies need not be considered when a function of mean and variance (μ - aσ²) is to be optimized. However, an example illustrates that policies which simultaneously minimize variances for all states may not exist. We therefore provide a dynamic programming formulation for optimizing μ_i - aσ_i² for each state i. An example is given to illustrate the procedure.
Date: 1978-09-01
Persistent link: https://EconPapers.repec.org/RePEc:iim:iimawp:wp00322