Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes
Xavier Venel and
Bruno Ziliotto
Additional contact information
Xavier Venel: CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, PSE - Paris School of Economics - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - ENPC - École nationale des ponts et chaussées - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement
Bruno Ziliotto: CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique, Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres
Post-Print from HAL
Abstract:
In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a robust notion of value for the infinitely repeated problem, namely the strong uniform value. This solves two open problems. First, this shows that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage problem, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, for any ε > 0, the decision-maker can guarantee the limit of the n-stage value minus ε in the infinite problem where the payoff is the expectation of the inferior limit of the time average payoff.
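The two results stated in the abstract can be written out as follows. This is only a sketch under assumed notation that does not appear in this listing: r_m is the stage-m payoff, γ_n(σ) is the expected average payoff over the first n stages under strategy σ, v_n is the n-stage value, and v_∞ is its limit. Dependence on the initial state is suppressed.

```latex
% Assumed notation (not taken from the listing): r_m = payoff at stage m,
% \sigma = a strategy of the decision-maker, initial state left implicit.
\[
\gamma_n(\sigma) = \mathbb{E}_\sigma\!\left[\frac{1}{n}\sum_{m=1}^{n} r_m\right],
\qquad
v_n = \sup_{\sigma} \gamma_n(\sigma),
\qquad
v_\infty = \lim_{n\to\infty} v_n .
\]

% First result: a single pure strategy is \varepsilon-optimal
% in every sufficiently long n-stage problem.
\[
\forall \varepsilon > 0,\ \exists\, \sigma \text{ pure},\ \exists\, n_0,\ \forall n \ge n_0:
\quad \gamma_n(\sigma) \ge v_n - \varepsilon .
\]

% Second result: the limit value is guaranteed, up to \varepsilon,
% when the payoff is the expected liminf of the time-average payoff.
\[
\forall \varepsilon > 0,\ \exists\, \sigma:
\quad \mathbb{E}_\sigma\!\left[\liminf_{n\to\infty} \frac{1}{n}\sum_{m=1}^{n} r_m\right]
\ge v_\infty - \varepsilon .
\]
```

The second inequality places the inferior limit inside the expectation, so it bounds the long-run average along realized play and not only the expected n-stage averages.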
Keywords: Partial Observation; Markov decision processes; Dynamic programming; Long-run average payoff; Uniform value
Date: 2016
Note: View the original document on HAL open archive server: https://hal.science/hal-01395429v1
Published in SIAM Journal on Control and Optimization, 2016, 54 (4), pp. 1983-2008. ⟨10.1137/15M1043340⟩
Downloads:
https://hal.science/hal-01395429v1/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-01395429
DOI: 10.1137/15M1043340