Semi-Markov Decision Processes with Unbounded Rewards
Steven A. Lippman
Additional contact information
Steven A. Lippman: University of California, Los Angeles
Management Science, 1973, vol. 19, issue 7, 717-731
Abstract:
We consider a semi-Markov decision process with arbitrary action space; the state space is the nonnegative integers. As in queueing systems, we assume that {0, 1, 2, ..., n + N} is the set of states accessible from state n in one transition, where N is finite and independent of n. The novel feature of this model is that the one-period reward is not required to be uniformly bounded; instead, we merely assume it to be bounded by a polynomial in n. Our main concern is with the average cost problem. A set of conditions sufficient for there to be an optimal stationary policy which can be obtained from the usual functional equation is developed. These conditions are quite weak and, as illustrated in several queueing examples, are easily verified.
Date: 1973
References: Add references at CitEc
Citations:
Downloads: (external link)
http://dx.doi.org/10.1287/mnsc.19.7.717 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:19:y:1973:i:7:p:717-731
Access Statistics for this article
More articles in Management Science from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().