Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback
Michael Jordan,
Tianyi Lin and
Zhengyuan Zhou
Additional contact information
Michael Jordan: Department of Electrical Engineering and Computer Science and Department of Statistics, University of California, Berkeley, Berkeley, California 94720
Tianyi Lin: Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027
Zhengyuan Zhou: Stern School of Business, New York University, New York, New York 10012
Operations Research, 2025, vol. 73, issue 3, 1675-1702
Abstract:
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions: (1) in the single-agent setting, it achieves an optimal regret of Θ(log T) for strongly convex cost functions, and (2) in the multiagent setting of strongly monotone games with each agent employing OGD, we obtain last-iterate convergence of the joint action to a unique Nash equilibrium at an optimal rate of Θ(1/T). Whereas these finite-time guarantees highlight its merits, OGD has the drawback that it requires knowing the strong convexity/monotonicity parameters. In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters. In the single-agent setting, our algorithm achieves O(log²(T)) regret under strong convexity, which is optimal up to a log factor. Further, if each agent employs AdaOGD in strongly monotone games, the joint action converges in a last-iterate sense to a unique Nash equilibrium at a rate of O(log³(T)/T), again optimal up to log factors. We illustrate our algorithms in a learning version of the classic newsvendor problem, in which, because of lost sales, only (noisy) gradient feedback can be observed. Our results immediately yield the first feasible and near-optimal algorithm for both the single-retailer and multiretailer settings. We also extend our results to the more general setting of exp-concave cost functions and games, using the online Newton step algorithm.
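To make the gradient-feedback setting concrete, the following is a minimal sketch of plain (non-adaptive) OGD on a single-retailer newsvendor problem, the abstract's motivating example. All specifics here (unit cost c, price p, Uniform(0, 10) demand, and the step size η_t = 1/(μt)) are illustrative assumptions, not taken from the paper; in particular, plain OGD needs the curvature μ, whereas the paper's AdaOGD (not reproduced here) removes that requirement.

```python
import numpy as np

# Illustrative sketch (not the paper's AdaOGD): plain online gradient
# descent for a newsvendor learner that observes only a noisy gradient,
# since lost sales hide the full cost. Hypothetical parameters:
rng = np.random.default_rng(1)
c, p = 1.0, 4.0       # unit cost and selling price (assumed)
mu = p * 0.1          # curvature of expected cost for Uniform(0, 10) demand
q = 5.0               # initial order quantity

for t in range(1, 50_001):
    demand = rng.uniform(0.0, 10.0)
    # Noisy gradient of the expected cost: only the event {D > q} is seen.
    grad = c - p * (demand > q)
    # OGD step with eta_t = 1/(mu * t), projected back onto [0, 10].
    q = float(np.clip(q - grad / (mu * t), 0.0, 10.0))

# The optimal order quantity is the (p - c)/p quantile of demand,
# i.e., 7.5 under these assumed parameters.
print(q)
```

The last iterate approaches the critical-quantile solution at the O(1/T) rate discussed in the abstract, but only because μ is supplied by hand; AdaOGD's point is to achieve near-optimal rates without it.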
Keywords: Optimization; no-regret learning; strongly monotone games; noisy gradient feedback; online optimization; Nash equilibrium; optimal regret; optimal last-iterate convergence rate; exp-concave games
Date: 2025
Downloads: http://dx.doi.org/10.1287/opre.2022.0446 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:73:y:2025:i:3:p:1675-1702
More articles in Operations Research from INFORMS.
Bibliographic data for series maintained by Chris Asher.