Existence of a Stationary Control for a Markov Chain Maximizing the Average Reward
Anders Martin-Löf
Anders Martin-Löf: The Royal Institute of Technology, Stockholm, Sweden
Operations Research, 1967, vol. 15, issue 5, 866-871
Abstract:
The problem of optimal control of a discrete time stationary Markov chain with complete state information has been considered by many authors. The case with finitely many states and controls has been thoroughly investigated. Chains with infinitely many states or controls have also been considered under various assumptions on the reward function. In this paper the existence of a control maximizing the average reward is established for Markov chains with a finite number of states and an arbitrary compact set of possible actions in each state. It is assumed that for every control the chain has a single ergodic class and no transient states. The proof uses methods from convex programming and is analogous to the linear programming approach of Wolfe and Dantzig.
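The linear programming approach mentioned in the abstract can be illustrated on a small finite case. The sketch below is not the paper's proof; it is a minimal, hypothetical example of the standard state-action-frequency LP for an average-reward unichain Markov decision chain (the transition probabilities and rewards are invented for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-state, 2-action unichain MDP (illustrative data only).
# P[s, a] is the transition distribution out of state s under action a.
P = np.array([
    [[0.5, 0.5], [0.9, 0.1]],   # state 0, actions 0 and 1
    [[0.2, 0.8], [0.4, 0.6]],   # state 1, actions 0 and 1
])
r = np.array([
    [1.0, 2.0],                 # rewards in state 0
    [0.0, 3.0],                 # rewards in state 1
])

nS, nA = r.shape
# Decision variables: stationary state-action frequencies x[s, a] >= 0.
# Maximize sum_{s,a} x[s,a] r(s,a)  <=>  minimize -r . x
c = -r.flatten()

# Balance constraints: for each state j,
#   sum_a x[j,a] - sum_{s,a} x[s,a] P(j | s, a) = 0
A_eq = np.zeros((nS + 1, nS * nA))
for j in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[j, s * nA + a] = (1.0 if s == j else 0.0) - P[s, a, j]
# Normalization: the frequencies form a probability distribution.
A_eq[nS, :] = 1.0
b_eq = np.zeros(nS + 1)
b_eq[nS] = 1.0

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
x = res.x.reshape(nS, nA)
gain = -res.fun               # maximal long-run average reward
policy = x.argmax(axis=1)     # stationary optimal action in each state
```

Under the unichain assumption stated in the abstract, an optimal basic solution of this LP concentrates its mass on one action per recurrent state, which is exactly a stationary deterministic control attaining the maximal average reward.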
Date: 1967
Downloads: http://dx.doi.org/10.1287/opre.15.5.866 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:15:y:1967:i:5:p:866-871