Constrained Markov Decision Chains
Cyrus Derman (Columbia University) and Arthur F. Veinott, Jr. (Stanford University)
Management Science, 1972, vol. 19, issue 4-Part-1, 389-390
Abstract:
We consider finite state and action discrete time parameter Markov decision chains. The objective is to provide an algorithm for finding a policy that minimizes the long-run expected average cost when there are linear side conditions on the limit points of the expected state-action frequencies. This problem has been solved previously only for the case where every deterministic stationary policy has at most one ergodic class. This note removes that restriction by applying the Dantzig-Wolfe decomposition principle.
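The abstract refers to the standard linear-programming formulation over limiting state-action frequencies, which underlies this line of work. As an illustrative sketch only (not the paper's decomposition algorithm, which handles multiple ergodic classes), the unichain average-cost case with one linear side constraint can be posed as an LP: minimize the expected cost under balance and normalization constraints on the frequencies. All numbers below are made up for illustration.

```python
# Sketch: average-cost MDP as an LP over limiting state-action frequencies
# x[s,a], for a small 2-state, 2-action example with one linear side
# constraint (illustrative data; not from the paper).
import numpy as np
from scipy.optimize import linprog

S, A = 2, 2
c = np.array([[1.0, 3.0],          # cost c[s][a]
              [2.0, 0.5]])
# P[s][a][s'] = transition probability from s under action a to s'
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])

# Equality constraints:
#   balance:       sum_a x(s',a) = sum_{s,a} P(s'|s,a) x(s,a)  for each s'
#   normalization: sum_{s,a} x(s,a) = 1
A_eq, b_eq = [], []
for sp in range(S):
    row = np.zeros(S * A)
    for s in range(S):
        for a in range(A):
            row[s * A + a] += P[s, a, sp]
            if s == sp:
                row[s * A + a] -= 1.0
    A_eq.append(row)
    b_eq.append(0.0)
A_eq.append(np.ones(S * A))
b_eq.append(1.0)

# One linear side condition on the frequencies, e.g. long-run fraction
# of time spent in state 0 must not exceed 0.6.
d = np.zeros(S * A)
d[0 * A + 0] = d[0 * A + 1] = 1.0

res = linprog(c.flatten(), A_ub=[d], b_ub=[0.6],
              A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=(0, None))
print(res.fun)  # minimal long-run expected average cost
```

An optimal solution x* yields a (possibly randomized) stationary policy via pi(a|s) = x*(s,a) / sum_a x*(s,a). The paper's contribution is precisely the multichain case, where the limit points of the frequencies need not satisfy a single stationary-distribution system and the Dantzig-Wolfe decomposition principle is applied instead.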
Date: 1972
Full text: http://dx.doi.org/10.1287/mnsc.19.4.389 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:19:y:1972:i:4-part-1:p:389-390