HSVI Can Solve Zero-Sum Partially Observable Stochastic Games
Aurélien Delage (),
Olivier Buffet (),
Jilles S. Dibangoye () and
Abdallah Saffidine ()
Additional contact information
Aurélien Delage: Universite de Lyon
Olivier Buffet: Universite de Lorraine
Jilles S. Dibangoye: University of Groningen
Abdallah Saffidine: University of New South Wales
Dynamic Games and Applications, 2024, vol. 14, issue 4, No 2, 805 pages
Abstract:
Abstract State-of-the-art methods for solving 2-player zero-sum imperfect information games rely on linear programming or regret minimization, though not on dynamic programming (DP) or heuristic search (HS), while the latter are often at the core of state-of-the-art solvers for other sequential decision-making problems. In partially observable or collaborative settings (e.g., POMDPs and Dec-POMDPs), DP and HS require introducing an appropriate statistic that induces a fully observable problem as well as bounding (convex) approximators of the optimal value function. This approach has succeeded in some subclasses of 2-player zero-sum partially observable stochastic games (zs-POSGs) as well, but how to apply it in the general case still remains an open question. We answer it by (i) rigorously defining an equivalent game to work with, (ii) proving mathematical properties of the optimal value function that allow deriving bounds that come with solution strategies, (iii) proposing for the first time an HSVI-like solver that provably converges to an $$\epsilon $$ ϵ -optimal solution in finite time, and (iv) empirically analyzing it. This opens the door to a novel family of promising approaches complementing those relying on linear programming or iterative methods.
Keywords: Game theory; zs-POSGs; Multi-agent systems; Heuristic search; Dynamic programming (search for similar items in EconPapers)
Date: 2024
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s13235-023-00519-6 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:dyngam:v:14:y:2024:i:4:d:10.1007_s13235-023-00519-6
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/13235
DOI: 10.1007/s13235-023-00519-6
Access Statistics for this article
Dynamic Games and Applications is currently edited by Georges Zaccour
More articles in Dynamic Games and Applications from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().