Stochastic Steffensen method
Minda Zhao,
Zehua Lai and
Lek-Heng Lim
Additional contact information
Minda Zhao: Georgia Institute of Technology
Zehua Lai: University of Texas
Lek-Heng Lim: University of Chicago
Computational Optimization and Applications, 2024, vol. 89, issue 1, No 1, 32 pages
Abstract:
Is it possible for a first-order method, i.e., one in which only first derivatives are allowed, to be quadratically convergent? For univariate loss functions, the answer is yes: the Steffensen method avoids second derivatives and is still quadratically convergent like Newton's method. By incorporating a specific step size we can even push its convergence order beyond quadratic to $1+\sqrt{2} \approx 2.414$. While such high convergence orders are pointless overkill for a deterministic algorithm, they become rewarding when the algorithm is randomized for problems of massive size, as randomization invariably compromises convergence speed. We introduce two adaptive learning rates inspired by the Steffensen method, intended for use in a stochastic optimization setting and requiring no hyperparameter tuning aside from batch size. Extensive experiments show that they compare favorably with several existing first-order methods. When restricted to a quadratic objective, our stochastic Steffensen methods reduce to the randomized Kaczmarz method (note that this is not true for SGD or SLBFGS), and thus we may also view our methods as a generalization of randomized Kaczmarz to arbitrary objectives.
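For orientation, the following is a minimal sketch of the classical deterministic Steffensen iteration applied to the derivative g = f' of a univariate loss, which is the textbook method the abstract takes as its starting point. It is not the stochastic variant or the adaptive learning rates proposed in the paper, whose details are not given in this listing. The function name `steffensen_minimize`, the test function f(x) = exp(x) - 2x and the tolerance values are assumptions chosen for illustration only.

```python
import math


def steffensen_minimize(grad, x0, tol=1e-12, max_iter=50):
    """Classical Steffensen iteration on g = f' (illustrative sketch).

    Update: x <- x - g(x)^2 / (g(x + g(x)) - g(x)).
    Only first derivatives of f are evaluated; the difference quotient
    stands in for the second derivative that Newton's method would use.
    """
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:  # stationary point reached to tolerance
            break
        denom = grad(x + g) - g
        if denom == 0.0:  # guard against division by zero
            break
        x = x - g * g / denom
    return x


# Hypothetical example: f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2,
# with minimizer x = ln 2.
def example_grad(x):
    return math.exp(x) - 2.0


x_star = steffensen_minimize(example_grad, x0=0.5)
```

Each iteration costs two gradient evaluations and no second derivative, which is the sense in which the method is first-order.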
Keywords: Steffensen method; Barzilai–Borwein; Quasi-Newton; Stochastic gradient descent; 65K10; 65B05; 65C05; 68W20
Date: 2024
Downloads:
http://link.springer.com/10.1007/s10589-024-00583-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:coopap:v:89:y:2024:i:1:d:10.1007_s10589-024-00583-7
Ordering information: This journal article can be ordered from
http://www.springer.com/math/journal/10589
DOI: 10.1007/s10589-024-00583-7
Computational Optimization and Applications is currently edited by William W. Hager
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.