Markov and semi-Markov decision models and optimal stopping
Manfred Schäl
Additional contact information
Manfred Schäl: Universität Bonn, Institut für Angewandte Mathematik
A chapter in Semi-Markov Models, 1986, pp 39-61 from Springer
Abstract:
We consider a system with a finite number of states i ∈ S. Periodically we observe the current state of the system and then choose an action a from a set A of possible actions. As a result of the current state i and the chosen action a, the system moves to a new state j with probability p_ij(a). As a further consequence, an immediate reward r(i, a) is earned. If the control process is stopped in a state i, then we obtain a terminal reward u(i). Thus, the underlying model is given by a tuple M = (S, A, p, r, u), where
(i) S is the state space, assumed to be finite;
(ii) A is the action space;
(iii) p_ij(a) are the transition probabilities, with ∑_{j∈S} p_ij(a) = 1 for all i ∈ S, a ∈ A;
(iv) r(i, a) is the real-valued one-step reward;
(v) u(i) is the real-valued terminal reward, or utility function.
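The model M = (S, A, p, r, u) described in the abstract can be illustrated with a small value-iteration sketch for the stopping problem: in each state one either stops (collecting u(i)) or continues with some action a (collecting r(i, a) and moving to j with probability p_ij(a)). The discount factor beta < 1 used below is an assumption added here to guarantee convergence and is not part of the model as stated; the chapter itself also treats total- and average-reward criteria.

```python
def value_iteration(S, A, p, r, u, beta=0.9, tol=1e-10):
    """Compute the value function of the discounted stopping problem.

    S : iterable of states; A : iterable of actions
    p(i, a, j) : transition probability; r(i, a) : one-step reward
    u(i) : terminal reward earned on stopping
    beta : discount factor (illustrative assumption, not in the model)
    """
    V = {i: u(i) for i in S}  # initialize with the terminal reward
    while True:
        V_new = {}
        for i in S:
            # value of continuing with the best action a
            cont = max(
                r(i, a) + beta * sum(p(i, a, j) * V[j] for j in S)
                for a in A
            )
            # optimality equation: stop or continue, whichever is larger
            V_new[i] = max(u(i), cont)
        if max(abs(V_new[i] - V[i]) for i in S) < tol:
            return V_new
        V = V_new


# Hypothetical two-state toy model: each step earns reward 1 and
# deterministically swaps the state; stopping earns nothing.
toy_V = value_iteration(
    S=[0, 1],
    A=["go"],
    p=lambda i, a, j: 1.0 if j != i else 0.0,
    r=lambda i, a: 1.0,
    u=lambda i: 0.0,
)
```

In the toy model the fixed point is V(i) = 1/(1 - beta) = 10 for both states, since continuing forever earns a geometric reward stream and stopping earns 0.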
Keywords: Decision Model; Markov Decision Process; Optimality Equation; Average Reward; Total Reward (search for similar items in EconPapers)
Date: 1986
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-1-4899-0574-1_4
Ordering information: This item can be ordered from
http://www.springer.com/9781489905741
DOI: 10.1007/978-1-4899-0574-1_4