A basic formula for performance gradient estimation of semi-Markov decision processes

Li, Yanjie; Cao, Fang

A basic formula for performance gradient estimation of semi-Markov decision processes

Yanjie Li and Fang Cao

European Journal of Operational Research, 2013, vol. 224, issue 2, 333-339

Abstract: This paper presents a basic formula for performance gradient estimation of semi-Markov decision processes (SMDPs) under average-reward criterion. This formula directly follows from a sensitivity equation in perturbation analysis. With this formula, we develop three sample-path-based gradient estimation algorithms by using a single sample path. These algorithms naturally extend many gradient estimation algorithms for discrete-time Markov systems to continuous time semi-Markov models. In particular, they require less storage than the algorithm in the literature.

Keywords: Markov processes; Semi-Markov decision processes; Sample-path-based gradient estimation; Perturbation analysis (search for similar items in EconPapers)
Date: 2013
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (4)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0377221712006108
Full text for ScienceDirect subscribers only

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:ejores:v:224:y:2013:i:2:p:333-339

DOI: 10.1016/j.ejor.2012.08.010

Access Statistics for this article

European Journal of Operational Research is currently edited by Roman Slowinski, Jesus Artalejo, Jean-Charles. Billaut, Robert Dyson and Lorenzo Peccati

More articles in European Journal of Operational Research from Elsevier
Bibliographic data for series maintained by Catherine Liu ().