Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes
Chelsea C. White and
William T. Scherer
Additional contact information
Chelsea C. White: University of Michigan, Ann Arbor, Michigan
William T. Scherer: University of Virginia, Charlottesville, Virginia
Operations Research, 1994, vol. 42, issue 3, 439-455
Abstract:
We develop bounds on the value function and a suboptimal design for the partially observed Markov decision process. These bounds and suboptimal design are based on the M most recent observations and actions. An a priori measure of the quality of these bounds is given. We show that larger M implies tighter bounds. An operations count analysis indicates that $(\#A\,\#Z)^{M+1}(\#S)$ multiplications and additions are required per successive approximations iteration of the suboptimal design algorithm, where A, Z, and S are the action, observation, and state spaces, respectively, suggesting the algorithm is of potential use for problems with large state spaces. A preliminary numerical study indicates that the quality of the suboptimal design can be excellent.
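To make the operations count in the abstract concrete, the following minimal sketch evaluates the quoted expression, (#A #Z) raised to the power M+1, times #S, for a few memory lengths M. The function name, parameter names, and problem sizes are hypothetical and chosen only for illustration; the sketch reproduces the arithmetic of the count, not the paper's algorithm.

```python
def ops_per_iteration(num_actions: int, num_observations: int,
                      num_states: int, memory: int) -> int:
    """Evaluate (#A * #Z)^(M+1) * #S, the per-iteration count quoted in the abstract.

    All argument names are illustrative assumptions, not notation from the paper.
    """
    if min(num_actions, num_observations, num_states) < 1 or memory < 0:
        raise ValueError("set sizes must be >= 1 and memory must be >= 0")
    return (num_actions * num_observations) ** (memory + 1) * num_states


if __name__ == "__main__":
    # Hypothetical sizes: 2 actions, 3 observations, 1000 states.
    for m in range(4):
        print(f"M={m}: {ops_per_iteration(2, 3, 1000, m)} operations")
```

Under these assumed sizes the count grows by a factor of #A #Z = 6 for each additional remembered observation-action pair, while it is only linear in the number of states, which is consistent with the abstract's remark about large state spaces.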
Keywords: dynamic programming; Markov decision processes
Date: 1994
Downloads: (external link)
http://dx.doi.org/10.1287/opre.42.3.439 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:42:y:1994:i:3:p:439-455