Markov Decision Problems Where Means Bound Variances
Alessandro Arlotto, Noah Gans and J. Michael Steele
Additional contact information
Alessandro Arlotto: The Fuqua School of Business, Duke University, Durham, North Carolina, 27708
Noah Gans: Operations and Information Management Department, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
J. Michael Steele: Statistics Department, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
Operations Research, 2014, vol. 62, issue 4, 864-875
Abstract:
We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of the optimal total reward can be bounded by a simple linear function of its expected value. The class is characterized by three natural properties: reward nonnegativity and boundedness, existence of a do-nothing action, and optimal action monotonicity. These properties are commonly present and typically easy to check. Implications of the class properties and of the variance bound are illustrated by examples of MDPs from operations research, operations management, financial engineering, and combinatorial optimization.
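The abstract's central claim has a simple shape. As a hedged sketch (the constant K and the precise statement below are assumptions for illustration, not the paper's verbatim theorem), with single-period rewards assumed to lie in [0, K], a linear mean-variance bound of the kind described reads:

\[
0 \le r_t \le K \ \text{for all periods } t
\quad\Longrightarrow\quad
\operatorname{Var}\bigl[R_n(\pi^*)\bigr] \;\le\; K\,\mathbb{E}\bigl[R_n(\pi^*)\bigr],
\]

where \(R_n(\pi^*)\) denotes the total reward collected by a mean-optimal policy \(\pi^*\) over an \(n\)-period horizon. One payoff of any bound of this form: when the expected optimal reward grows with the horizon, the coefficient of variation satisfies \(\sqrt{\operatorname{Var}[R_n(\pi^*)]}/\mathbb{E}[R_n(\pi^*)] \le \sqrt{K/\mathbb{E}[R_n(\pi^*)]}\), so the optimal total reward concentrates around its mean.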
Keywords: Markov decision problems; variance bounds; optimal total reward
Date: 2014
Downloads: http://dx.doi.org/10.1287/opre.2014.1281 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:62:y:2014:i:4:p:864-875