An Approximate Dynamic Programming Approach to Repeated Games with Vector Losses
Vijay Kamble (),
Patrick Loiseau () and
Jean Walrand ()
Additional contact information
Vijay Kamble: Department of Information and Decision Sciences, University of Illinois, Chicago, Illinois 60607
Patrick Loiseau: University Grenoble Alpes, INRIA, Centre National de la Recherche Scientifique, Institut Polytechnique de Grenoble, Laboratoire d’Informatique de Grenoble, 38058 Grenoble, France; Max-Planck Institute for Software Systems, D-66123 Saarbrücken, Germany
Jean Walrand: Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720
Operations Research, 2024, vol. 72, issue 1, 373-388
Abstract:
We describe an approximate dynamic programming (ADP) approach to compute approximations of the optimal strategies and of the minimal losses that can be guaranteed in discounted repeated games with vector-valued losses. Among other applications, such vector-valued games prominently arise in the analysis of worst-case regret in repeated decision making in unknown environments, also known as the adversarial online learning framework. At the core of our approach is a characterization of the lower Pareto frontier of the set of expected losses that a player can guarantee in these games as the unique fixed point of a set-valued dynamic programming operator. When applied to the problem of worst-case regret minimization with discounted losses, our approach yields algorithms that achieve markedly improved performance bounds compared with off-the-shelf online learning algorithms like Hedge. These results thus suggest the significant potential of ADP-based approaches in adversarial online learning.
Keywords: Optimization; vector repeated games; online learning; approximate dynamic programming (search for similar items in EconPapers)
Date: 2024
References: Add references at CitEc
Citations:
Downloads: (external link)
http://dx.doi.org/10.1287/opre.2022.2334 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:72:y:2024:i:1:p:373-388
Access Statistics for this article
More articles in Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().