Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes

White, Chelsea C.; Scherer, William T.

Chelsea C. White and William T. Scherer
Additional contact information
Chelsea C. White: University of Michigan, Ann Arbor, Michigan
William T. Scherer: University of Virginia, Charlottesville, Virginia

Operations Research, 1994, vol. 42, issue 3, 439-455

Abstract: We develop bounds on the value function and a suboptimal design for the partially observed Markov decision process. These bounds and suboptimal design are based on the M most recent observations and actions. An a priori measure of the quality of these bounds is given. We show that larger M implies tighter bounds. An operations count analysis indicates that $(\#A\,\#Z)^{M+1}(\#S)$ multiplications and additions are required per successive approximations iteration of the suboptimal design algorithm, where A, Z, and S are the action, observation, and state spaces, respectively, suggesting the algorithm is of potential use for problems with large state spaces. A preliminary numerical study indicates that the quality of the suboptimal design can be excellent.
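
To give a feel for the operations count quoted in the abstract, the following minimal sketch evaluates (#A #Z)^(M+1) (#S) for a few memory lengths M. It assumes that #X denotes the number of elements of the set X; the function name, parameter names, and problem sizes are hypothetical and chosen only for illustration, not taken from the article.

```python
def ops_per_iteration(num_actions: int, num_observations: int,
                      num_states: int, memory: int) -> int:
    """Evaluate the count quoted in the abstract: (#A * #Z)^(M+1) * #S.

    Assumption: each argument is the cardinality of the corresponding
    space (actions A, observations Z, states S), and `memory` is M.
    """
    if min(num_actions, num_observations, num_states) < 1:
        raise ValueError("space sizes must be positive integers")
    if memory < 0:
        raise ValueError("memory length M must be non-negative")
    return (num_actions * num_observations) ** (memory + 1) * num_states


if __name__ == "__main__":
    # Hypothetical problem sizes, used only to show how the count scales.
    for m in range(4):
        print(f"M={m}: {ops_per_iteration(2, 2, 1000, m)} operations")
```

By the formula, the count is linear in the number of states and grows geometrically with M, which is consistent with the abstract's remark that the approach may suit problems with large state spaces.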

Keywords: dynamic programming; Markov decision processes
Date: 1994

Downloads: (external link)
http://dx.doi.org/10.1287/opre.42.3.439 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:42:y:1994:i:3:p:439-455
