Near-optimal no-regret algorithms for zero-sum games
Constantinos Daskalakis,
Alan Deckelbaum and
Anthony Kim
Games and Economic Behavior, 2015, vol. 92, issue C, 327-348
Abstract:
We propose a new no-regret learning algorithm. When used against an adversary, our algorithm achieves average regret that scales optimally as O(1/√T) with the number T of rounds. However, when our algorithm is used by both players of a zero-sum game, their average regret scales as O(ln T / T), guaranteeing a near-linear rate of convergence to the value of the game. This represents an almost-quadratic improvement on the rate of convergence to the value of a zero-sum game previously known to be achievable by any no-regret learning algorithm. Moreover, it is essentially optimal, as we also show a lower bound of Ω(1/T) for all distributed dynamics, as long as the players do not know their payoff matrices at the beginning of the dynamics. (If they did, they could privately compute minimax strategies and play them ad infinitum.)
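The paper's own algorithm is not reproduced in the abstract. As a point of comparison, the following is a minimal sketch of a standard no-regret baseline, multiplicative weights (Hedge) run by both players of a zero-sum game, whose average regret decays at the slower O(1/√T) rate that the paper improves to O(ln T / T) in self-play. The payoff matrix and step size here are illustrative, not from the paper.

```python
import numpy as np

def mw_selfplay(A, T, eta):
    """Both players run multiplicative weights (Hedge) on a zero-sum game
    with payoff matrix A (row player maximizes x^T A y, column player
    minimizes). Returns the row player's average regret after T rounds:
    (best fixed row in hindsight) minus (average realized payoff)."""
    n, m = A.shape
    wx = np.ones(n)       # row player's weights over pure strategies
    wy = np.ones(m)       # column player's weights over pure strategies
    gains = np.zeros(n)   # cumulative payoff of each pure row strategy
    payoffs = []          # realized expected payoff per round
    for _ in range(T):
        x = wx / wx.sum()
        y = wy / wy.sum()
        payoffs.append(x @ A @ y)
        gx = A @ y            # payoff of each pure row vs. current y
        gy = -(A.T @ x)       # column player minimizes, so negate
        gains += gx
        wx *= np.exp(eta * gx)
        wy *= np.exp(eta * gy)
    return gains.max() / T - np.mean(payoffs)

# Illustrative 2x2 zero-sum game (value 0.2 at the mixed equilibrium).
A = np.array([[2.0, -1.0], [-1.0, 1.0]])
for T in (100, 1000, 10000):
    print(T, mw_selfplay(A, T, eta=np.sqrt(np.log(2) / T)))
```

Running this shows the average regret shrinking as T grows, on the order of 1/√T for the horizon-tuned step size used here; the point of the paper is that a suitably designed algorithm achieves the much faster ln T / T decay when played against itself.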
Keywords: Zero-sum games; Repeated games; Learning; No-regret dynamics
JEL-codes: C72; C73
Date: 2015
Citations: 1 (in EconPapers)
Downloads: http://www.sciencedirect.com/science/article/pii/S0899825614000049 (full text for ScienceDirect subscribers only)
Persistent link: https://EconPapers.repec.org/RePEc:eee:gamebe:v:92:y:2015:i:c:p:327-348
DOI: 10.1016/j.geb.2014.01.003
Games and Economic Behavior is currently edited by E. Kalai
Bibliographic data for this series is maintained by Catherine Liu.