Computing semi-stationary optimal policies for multichain semi-Markov decision processes

Prasenjit Mondal (Government General Degree College)

Annals of Operations Research, 2020, vol. 287, issue 2, No 16, 843-865

Abstract: We consider semi-Markov decision processes with finite state and action spaces and a general multichain structure. A form of limiting ratio average (undiscounted) reward is the criterion for comparing different policies. The main result is that the value vector and a pure optimal semi-stationary policy (i.e., a policy which depends only on the initial state and the current state) for such an SMDP can be computed directly from an optimal solution of a finite set (whose cardinality equals the number of states) of linear programming (LP) problems. To be more precise, we prove that the single LP associated with a fixed initial state provides the value and an optimal pure stationary policy of the corresponding SMDP. The relation between the set of feasible solutions of each LP and the set of stationary policies is also analyzed. Examples are worked out to describe the algorithm.
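
Illustration (not from the paper): the record does not reproduce the paper's LPs, so the sketch below only illustrates the criterion the abstract names. It assumes the ratio average reward of a fixed pure stationary policy is the limit of (sum of expected rewards) / (sum of expected sojourn times) over the first n decision epochs; that definition, the function name ratio_average_reward, and the three-state data are all hypothetical choices made for this example. The chain is multichain (two absorbing states plus one transient state), so the value differs by initial state, which is the situation in which a policy may need to depend on the initial state.

```python
def ratio_average_reward(P, r, tau, start, n_steps=10_000):
    """Approximate sum_m E[r_m] / sum_m E[tau_m] over the first n_steps epochs.

    P[s][j] : transition probability s -> j under the fixed policy (assumed data)
    r[s]    : expected reward earned per visit to state s (assumed data)
    tau[s]  : expected sojourn time per visit to state s (assumed data)
    """
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0  # the process starts in `start` with probability 1
    total_reward = 0.0
    total_time = 0.0
    for _ in range(n_steps):
        # Accumulate expected reward and expected time at this decision epoch.
        total_reward += sum(dist[s] * r[s] for s in range(n))
        total_time += sum(dist[s] * tau[s] for s in range(n))
        # Advance the state distribution of the embedded Markov chain by one step.
        dist = [sum(dist[s] * P[s][j] for s in range(n)) for j in range(n)]
    return total_reward / total_time


# Hypothetical multichain example: states 0 and 1 are absorbing,
# state 2 is transient and moves to 0 or 1 with probability 1/2 each.
P = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
]
r = [2.0, 6.0, 0.0]
tau = [1.0, 2.0, 1.0]

values = [ratio_average_reward(P, r, tau, s) for s in range(3)]
for s, v in enumerate(values):
    print(f"initial state {s}: approximate ratio average reward {v:.4f}")
```

Under the assumed definition, the transient state's value tends to (0.5*2 + 0.5*6) / (0.5*1 + 0.5*2) = 8/3 rather than the plain average of the two absorbing-state values (2 and 3), because the criterion takes a ratio of expectations.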

Keywords: Semi-Markov decision processes; Limiting ratio average reward; Multichain structure; Pure optimal semi-stationary policies; Linear programming; 60K15; 60K20
Date: 2020

Downloads: (external link)
http://link.springer.com/10.1007/s10479-017-2686-x Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:287:y:2020:i:2:d:10.1007_s10479-017-2686-x

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479

DOI: 10.1007/s10479-017-2686-x

Annals of Operations Research is currently edited by Endre Boros
