
Solution Procedures for Partially Observed Markov Decision Processes

Chelsea C. White and William T. Scherer
Additional contact information
Chelsea C. White: University of Virginia, Charlottesville, Virginia
William T. Scherer: University of Virginia, Charlottesville, Virginia

Operations Research, 1989, vol. 37, issue 5, 791-797

Abstract: We present three algorithms to solve the infinite horizon, expected discounted total reward partially observed Markov decision process (POMDP). Each algorithm integrates a successive approximations algorithm for the POMDP due to R. D. Smallwood and E. J. Sondik with an appropriately generalized numerical technique that has been shown to reduce CPU time until convergence for the completely observed case. The first technique is reward revision. The second technique is reward revision integrated with modified policy iteration. The third is a standard extrapolation. A numerical study indicates the potentially significant computational value of these algorithms.
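The abstract describes the algorithms only at a high level. For orientation, the following is a minimal sketch of successive approximations (value iteration) on the belief space of a discounted POMDP, using a fixed grid of belief points instead of the exact piecewise-linear value representation of Smallwood and Sondik, and without the paper's reward revision, modified policy iteration, or extrapolation accelerations. All model arrays (P, Q, R) are hypothetical placeholders, not data from the paper.

```python
import numpy as np

# Illustrative sketch only: grid-based successive approximations for a
# discounted POMDP. The paper's algorithms instead accelerate the exact
# Smallwood-Sondik successive approximations scheme.

# Hypothetical 2-state, 2-action, 2-observation model.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],      # P[a, s, s']: transition probabilities
              [[0.5, 0.5], [0.4, 0.6]]])
Q = np.array([[[0.8, 0.2], [0.3, 0.7]],      # Q[a, s', z]: observation probabilities
              [[0.6, 0.4], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],                    # R[a, s]: immediate expected rewards
              [0.5, 0.5]])
beta = 0.95                                  # discount factor

# Fixed grid of belief points b = (b0, 1 - b0).
grid = np.linspace(0.0, 1.0, 101)
beliefs = np.stack([grid, 1.0 - grid], axis=1)

def belief_update(b, a, z):
    """Bayes update of the belief state after action a and observation z."""
    unnorm = Q[a, :, z] * (b @ P[a])         # joint probability over next states s'
    total = unnorm.sum()                     # total = P(z | b, a)
    return (unnorm / total, total) if total > 0 else (b, 0.0)

def interp_value(V, b):
    """Piecewise-linear interpolation of V over the belief grid (indexed by b[0])."""
    return np.interp(b[0], grid, V)

V = np.zeros(len(grid))
for _ in range(500):                         # successive approximations
    V_new = np.empty_like(V)
    for i, b in enumerate(beliefs):
        best = -np.inf
        for a in range(P.shape[0]):
            value = b @ R[a]                 # expected immediate reward
            for z in range(Q.shape[2]):
                b_next, pz = belief_update(b, a, z)
                value += beta * pz * interp_value(V, b_next)
            best = max(best, value)
        V_new[i] = best
    if np.max(np.abs(V_new - V)) < 1e-6:     # sup-norm stopping test
        V = V_new
        break
    V = V_new
```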

Keywords: dynamic programming: Markov; finite state
Date: 1989
Citations: 10 (in EconPapers)

Downloads: http://dx.doi.org/10.1287/opre.37.5.791 (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:37:y:1989:i:5:p:791-797


More articles in Operations Research from INFORMS.

Handle: RePEc:inm:oropre:v:37:y:1989:i:5:p:791-797