Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
E.F. Arruda and M.D. Fragoso
European Journal of Operational Research, 2015, vol. 240, issue 3, 697-705
Abstract:
This paper introduces a two-phase approach to solve average cost Markov decision processes, which is based on state space embedding or time aggregation. In the first phase, time aggregation is applied for policy optimization in a prescribed subset of the state space, and a novel result is applied to expand the evaluation to the whole state space. This evaluation is then used in the second phase in a policy improvement step, and the two phases are then alternated until convergence is attained. Some numerical experiments illustrate the results.
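The time aggregation idea mentioned in the abstract can be illustrated for policy evaluation alone. The sketch below is not the paper's two-phase algorithm, whose details are not given in this listing. It only shows the standard renewal-reward identity for a fixed policy: the long-run average cost equals the expected cost per visit cycle to a chosen subset divided by the expected cycle length, both weighted by the stationary distribution of the chain embedded on that subset. The function names, the example matrix `P`, the cost vector `c`, and the subset `[0, 1]` are illustrative assumptions, and the chain is assumed irreducible with a non-empty proper subset.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an irreducible stochastic matrix P."""
    n = P.shape[0]
    A = P.T - np.eye(n)
    A[-1, :] = 1.0            # replace one balance equation by normalization
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

def time_aggregated_average_cost(P, c, subset):
    """Average cost of a fixed policy, computed on the chain embedded in `subset`.

    Assumes `subset` is a non-empty proper subset of the states
    (hypothetical helper, not taken from the paper).
    """
    n = P.shape[0]
    inside = set(subset)
    F = np.array(sorted(inside), dtype=int)
    G = np.array([i for i in range(n) if i not in inside], dtype=int)
    P_FF, P_FG = P[np.ix_(F, F)], P[np.ix_(F, G)]
    P_GF, P_GG = P[np.ix_(G, F)], P[np.ix_(G, G)]
    # N[i, j]: expected visits to outside state j before re-entering the subset
    N = np.linalg.inv(np.eye(len(G)) - P_GG)
    P_emb = P_FF + P_FG @ N @ P_GF                 # embedded chain on the subset
    seg_cost = c[F] + P_FG @ N @ c[G]              # expected cost per cycle
    seg_len = 1.0 + P_FG @ N @ np.ones(len(G))     # expected cycle length
    pi_emb = stationary_distribution(P_emb)
    return (pi_emb @ seg_cost) / (pi_emb @ seg_len)

# Illustrative 4-state chain under some fixed policy (made-up numbers).
P = np.array([
    [0.5, 0.2, 0.2, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.2, 0.2, 0.1, 0.5],
])
c = np.array([1.0, 4.0, 2.0, 3.0])

eta_agg = time_aggregated_average_cost(P, c, [0, 1])
eta_direct = stationary_distribution(P) @ c   # reference value on the full chain
print(eta_agg, eta_direct)
```

The two printed values should agree up to floating-point error. The linear systems solved on the embedded chain have the size of the subset and its complement rather than of the whole state space, which is the usual motivation for working on a prescribed subset.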
Keywords: Dynamic programming; Markov decision processes; Embedding; Time aggregation; Stochastic optimal control
Date: 2015
Downloads:
http://www.sciencedirect.com/science/article/pii/S0377221714006584
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:ejores:v:240:y:2015:i:3:p:697-705
DOI: 10.1016/j.ejor.2014.08.023
European Journal of Operational Research is currently edited by Roman Slowinski, Jesus Artalejo, Jean-Charles Billaut, Robert Dyson and Lorenzo Peccati
Bibliographic data for series maintained by Catherine Liu.