The finiteness of the reward function and the optimal value function in Markov decision processes
Qiying Hu and Chen Xu
Mathematical Methods of Operations Research, 1999, vol. 49, issue 2, 255-266
Abstract:
This paper studies discrete-time Markov decision processes (MDPs) with the expected discounted total reward criterion, where the state space is countable, the action space is measurable, the reward function is extended real-valued, and the discount rate may be any real number. Two conditions, (GC) and (C), are presented, which are weaker than those presented in the literature. By eliminating some worst actions, the state space S can be partitioned into sets S_∞, S_−∞, and S_0, on which the optimal value function equals +∞, equals −∞, or is finite, respectively. Furthermore, the validity of the optimality equation is shown when its right-hand side is well defined, in particular when it is restricted to the subset S_0. The reward function r(i, a) is finite and bounded above in a for each i ∈ S_0. Finally, some sufficient conditions for (GC) and (C) are given. Copyright Springer-Verlag Berlin Heidelberg 1999
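For orientation, the partition and the optimality equation mentioned in the abstract can be sketched in the standard notation for discounted MDPs. The symbols A(i) (action set at state i), p(j | i, a) (transition probability), β (discount rate), and V* (optimal value function) are assumptions introduced here for illustration and are not taken from the paper; the paper's precise statement, including how the sum is restricted and which actions remain after elimination, may differ.

```latex
% Assumed notation (not from the paper): A(i) = actions available at state i,
% p(j | i, a) = transition probability, \beta = discount rate,
% V^* = optimal value function.

% Partition of the state space by the value of V^*:
\[
  S_{\infty} = \{ i \in S : V^*(i) = +\infty \}, \quad
  S_{-\infty} = \{ i \in S : V^*(i) = -\infty \}, \quad
  S_0 = \{ i \in S : -\infty < V^*(i) < +\infty \}.
\]

% Standard form of the discounted optimality equation, written on S_0:
\[
  V^*(i) = \sup_{a \in A(i)} \Big\{ r(i,a)
           + \beta \sum_{j \in S} p(j \mid i, a)\, V^*(j) \Big\},
  \qquad i \in S_0 .
\]
```

Since r(i, a) is stated to be finite and bounded above in a for each i ∈ S_0, the reward term inside the supremum is finite for every action at such states.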
Keywords: Markov decision processes; expected discounted total rewards; optimality equation; decomposing the state space; eliminating actions
Date: 1999
Downloads: (external link)
http://hdl.handle.net/10.1007/PL00020916 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:spr:mathme:v:49:y:1999:i:2:p:255-266
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/00186
DOI: 10.1007/PL00020916
Mathematical Methods of Operations Research is currently edited by Oliver Stein
More articles in Mathematical Methods of Operations Research from Springer, Gesellschaft für Operations Research (GOR), Nederlands Genootschap voor Besliskunde (NGB)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.