Computationally Feasible Bounds for Partially Observed Markov Decision Processes

William S. Lovejoy
Additional contact information
William S. Lovejoy: Stanford University, Stanford, California

Operations Research, 1991, vol. 39, issue 1, 162-175

Abstract: A partially observed Markov decision process (POMDP) is a sequential decision problem where information concerning parameters of interest is incomplete, and possible actions include sampling, surveying, or otherwise collecting additional information. Such problems can theoretically be solved as dynamic programs, but the relevant state space is infinite, which inhibits algorithmic solution. This paper explains how to approximate the state space by a finite grid of points, and use that grid to construct upper and lower value function bounds, generate approximate nonstationary and stationary policies, and bound the value loss relative to optimal for using these policies in the decision problem. A numerical example illustrates the methodology.
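
The grid construction described in the abstract is concrete enough to sketch. Below is a minimal illustration, not the paper's method: it uses a two-state model so the belief simplex is the interval [0, 1] and plain 1-D linear interpolation suffices, whereas the paper works on general, higher-dimensional belief simplices. The model parameters are the standard "tiger" example from the POMDP literature, assumed here purely for illustration; the discount factor, grid size, and sweep count are likewise arbitrary choices. The two bounds rest on two facts the abstract alludes to: the optimal value function is convex in the belief, so interpolating grid values over-estimates it, and any alpha-vector produced by exact backups is valid at every belief, so maximizing over grid-generated alpha-vectors under-estimates it.

```python
# A minimal sketch of the grid idea, not the paper's construction: a two-state
# POMDP so the belief simplex is [0, 1]. Interpolated value iteration started
# from an upper bound stays an upper bound (chords of a convex function lie
# above it); alpha-vectors from backups at grid points are valid at every
# belief, so their maximum is a lower bound. The "tiger" numbers below are a
# textbook example, not data from the paper.
import numpy as np

gamma = 0.95
# rows: actions {listen, open-left, open-right}; cols: states {tiger-L, tiger-R}
R = np.array([[-1.0, -1.0], [-100.0, 10.0], [10.0, -100.0]])
# T[a][s, s']: listening leaves the state alone; opening resets the problem
T = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.5, 0.5], [0.5, 0.5]],
              [[0.5, 0.5], [0.5, 0.5]]])
# Z[a][s', o]: observations {hear-L, hear-R}, informative only after listening
Z = np.array([[[0.85, 0.15], [0.15, 0.85]],
              [[0.5, 0.5], [0.5, 0.5]],
              [[0.5, 0.5], [0.5, 0.5]]])

grid = np.linspace(0.0, 1.0, 41)   # belief grid over p = P(tiger-L)

def upper_sweep(V):
    """One value-iteration sweep on the grid, evaluating successor beliefs
    by linear interpolation of V (an over-estimate of a convex function)."""
    out = np.empty_like(V)
    for i, p in enumerate(grid):
        b = np.array([p, 1.0 - p])
        vals = []
        for a in range(3):
            q, bp = b @ R[a], b @ T[a]
            for o in range(2):
                po = bp @ Z[a][:, o]                  # P(o | b, a)
                if po > 1e-12:
                    bo = bp * Z[a][:, o] / po         # Bayes update
                    q += gamma * po * np.interp(bo[0], grid, V)
            vals.append(q)
        out[i] = max(vals)
    return out

def lower_sweep(alphas):
    """Point-based backup at each grid point; every resulting alpha-vector
    under-estimates the optimal value at all beliefs, not just on the grid."""
    new = []
    for p in grid:
        b = np.array([p, 1.0 - p])
        cands = []
        for a in range(3):
            vec = R[a].copy()
            for o in range(2):
                gs = [T[a] @ (Z[a][:, o] * al) for al in alphas]
                vec = vec + gamma * max(gs, key=lambda g: b @ g)
            cands.append(vec)
        new.append(max(cands, key=lambda v: b @ v))
    return new

V_up = np.full_like(grid, R.max() / (1 - gamma))   # start above V*
alphas = [np.full(2, R.min() / (1 - gamma))]       # start below V*
for _ in range(200):
    V_up, alphas = upper_sweep(V_up), lower_sweep(alphas)

b0 = 0.5   # uniform prior over the tiger's position
lo = max(np.array([b0, 1 - b0]) @ al for al in alphas)
hi = np.interp(b0, grid, V_up)
print(f"V*(uniform belief) is bracketed in [{lo:.2f}, {hi:.2f}]")
```

Refining the grid tightens both bounds toward the optimal value; the paper's contribution is to carry out this construction on general belief simplices and to bound the value loss of the approximate stationary and nonstationary policies it generates.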

Keywords: dynamic programming: partially observed Markov decision processes; dynamic programming, Markov: Bayesian programming and infinite state Markov models
Date: 1991
Citations: 22 (in EconPapers)

Downloads: http://dx.doi.org/10.1287/opre.39.1.162 (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:39:y:1991:i:1:p:162-175

More articles in Operations Research from INFORMS.

 
Handle: RePEc:inm:oropre:v:39:y:1991:i:1:p:162-175