
Finite-Memory Strategies in POMDPs with Long-Run Average Objectives

Krishnendu Chatterjee, Raimundo Saona and Bruno Ziliotto
Additional contact information
Krishnendu Chatterjee: Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria
Raimundo Saona: Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria
Bruno Ziliotto: Centre de Recherche en Mathématiques de la Décision, Centre National de la Recherche Scientifique, Université Paris Dauphine, Université PSL, 75016 Paris, France

Mathematics of Operations Research, 2022, vol. 47, issue 1, 100-119

Abstract: Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with a long-run average objective, the decision maker has approximately optimal strategies with finite memory. This implies, in particular, that the problem of approximating the long-run value is recursively enumerable, and that the value satisfies a weak continuity property with respect to the transition function.
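
To fix notation (a generic formulation, not necessarily the paper's own), the long-run average criterion discussed in the abstract can be written as follows, assuming hidden states s_t, actions a_t, a stage reward r, an initial belief \mu, and strategies \sigma that depend only on past observations and actions:

    v_n(\mu, \sigma) = \mathbb{E}^{\sigma}_{\mu}\!\left[ \frac{1}{n} \sum_{t=1}^{n} r(s_t, a_t) \right],
    \qquad
    v(\mu) = \lim_{n \to \infty} \sup_{\sigma} v_n(\mu, \sigma),

with the limit replaced by a \limsup if its existence is in question. In these terms, the abstract's main statement is that for every \varepsilon > 0 there is a strategy implementable with finitely many memory states, updated from the observation stream, whose long-run average payoff is within \varepsilon of the value.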

Keywords: 90C39; 90C40; 37A50; 03D15; finite state; Markov; dynamic programming; computational complexity; analysis of algorithms
Date: 2022

Downloads: http://dx.doi.org/10.1287/moor.2020.1116 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:47:y:2022:i:1:p:100-119

More articles in Mathematics of Operations Research from INFORMS.
Bibliographic data for series maintained by Chris Asher.

 
Handle: RePEc:inm:ormoor:v:47:y:2022:i:1:p:100-119