Convergence Analysis of Accelerated Stochastic Gradient Descent Under the Growth Condition
You-Lin Chen,
Sen Na and
Mladen Kolar
Additional contact information
You-Lin Chen: Department of Statistics, The University of Chicago, Chicago, Illinois 60637
Sen Na: International Computer Science Institute, Berkeley, California 94704; Department of Statistics, University of California, Berkeley, California 94720
Mladen Kolar: Booth School of Business, The University of Chicago, Chicago, Illinois 60637
Mathematics of Operations Research, 2024, vol. 49, issue 4, 2492-2526
Abstract:
We study the convergence of accelerated stochastic gradient descent (SGD) for strongly convex objectives under the growth condition, which states that the variance of the stochastic gradient is bounded by a multiplicative part that grows with the full gradient and a constant additive part. Through the lens of the growth condition, we investigate four widely used accelerated methods: Nesterov’s accelerated method (NAM), the robust momentum method (RMM), the accelerated dual averaging method (DAM+), and implicit DAM+ (iDAM+). Although these methods are known to improve the convergence rate of SGD when the stochastic gradient has bounded variance, it is not well understood how their convergence rates are affected by multiplicative noise. In this paper, we show that these methods all converge to a neighborhood of the optimum with accelerated convergence rates (compared with SGD) even under the growth condition. In particular, NAM, RMM, and iDAM+ enjoy acceleration only under mild multiplicative noise, whereas DAM+ enjoys acceleration even under large multiplicative noise. Furthermore, we propose a generic tail-averaging scheme that allows the accelerated rates of DAM+ and iDAM+ to nearly attain the theoretical lower bound (up to a logarithmic factor in the variance term). We conduct numerical experiments to support our theoretical conclusions.
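As a reading aid, the growth condition referenced above can be written as a variance bound on the stochastic gradient; the symbols g(x), rho, sigma^2, and the averaging window T_0 below are illustrative notation under stated assumptions, not necessarily the paper's own:

\mathbb{E}\big[\, \| g(x) - \nabla f(x) \|^2 \,\big] \;\le\; \rho \,\| \nabla f(x) \|^2 + \sigma^2,

where g(x) is an unbiased stochastic gradient of the objective f at x, rho >= 0 scales the multiplicative noise, and sigma^2 is the additive variance. A generic tail-averaged iterate of the kind the abstract describes averages only the last T - T_0 iterates,

\bar{x}_T \;=\; \frac{1}{T - T_0} \sum_{t = T_0 + 1}^{T} x_t.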
Keywords: Primary: 90-08; 90C06; 90C15; 90C25; 90C90; accelerated SGD; stochastic approximation; strongly convex optimization; growth condition
Date: 2024
Downloads: http://dx.doi.org/10.1287/moor.2021.0293 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:49:y:2024:i:4:p:2492-2526
More articles in Mathematics of Operations Research from INFORMS.