Asymptotic study of stochastic adaptive algorithm in non-convex landscape
Sébastien Gadat and
Ioana Gavra
Additional contact information
Sébastien Gadat: TSE-R - Toulouse School of Economics - Université Toulouse Capitole - CNRS - INRAE
Ioana Gavra: IRMAR - Institut de Recherche Mathématique de Rennes - Université de Rennes - CNRS
Post-Print from HAL
Abstract:
This paper studies asymptotic properties of adaptive algorithms widely used in optimization and machine learning, among them Adagrad and RMSProp, which underlie most black-box deep learning methods. We adopt a non-convex landscape optimization viewpoint with a single time-scale parametrization, and we cover the settings where these algorithms are run with or without mini-batches. Treating the methods as stochastic algorithms with a decreasing step size, we establish their almost sure convergence towards the set of critical points of the target function. Under a mild extra assumption on the noise, we also obtain convergence towards the set of minimizers of the function. Along the way, we derive a "convergence rate" for the methods, in the vein of the works of [GL13].
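To fix ideas, a minimal sketch of the kind of algorithm the abstract describes is given below: Adagrad with a decreasing step size, run on a toy non-convex one-dimensional target with noisy gradients. All function names, default values, and the toy target are illustrative choices, not taken from the paper.

```python
import numpy as np

def adagrad(grad_fn, x0, n_steps=500, eta=0.5, eps=1e-8, seed=0):
    """Adagrad sketch with a decreasing step-size schedule eta / sqrt(t).

    grad_fn(x, rng) returns a stochastic estimate of the gradient at x.
    Names and defaults are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    acc = np.zeros_like(x)            # running sum of squared gradients
    for t in range(1, n_steps + 1):
        g = grad_fn(x, rng)
        acc += g * g                  # coordinate-wise accumulation
        step = eta / np.sqrt(t)       # decreasing step size
        x -= step * g / (np.sqrt(acc) + eps)
    return x

# Toy non-convex target f(x) = x^4 - x^2, minimizers at +/- 1/sqrt(2),
# with additive Gaussian noise on the gradient.
noisy_grad = lambda x, rng: 4 * x**3 - 2 * x + 0.1 * rng.standard_normal()
x_star = adagrad(noisy_grad, x0=0.3)
```

On this toy problem the iterates settle near one of the two minimizers, consistent with the almost-sure convergence to minimizers stated in the abstract; RMSProp differs only in replacing the cumulative sum `acc` by an exponential moving average of squared gradients.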
Keywords: Stochastic optimization; Stochastic adaptive algorithm; Convergence of random variables (search for similar items in EconPapers)
Date: 2022-08
New Economics Papers: this item is included in nep-big and nep-cmp
Note: View the original document on HAL open archive server: https://hal.science/hal-03857182v1
References: View references in EconPapers View complete reference list from CitEc
Published in Journal of Machine Learning Research, 2022, 23 (228), pp.1-54
Downloads: (external link)
https://hal.science/hal-03857182v1/document (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-03857182
More papers in Post-Print from HAL
Bibliographic data for series maintained by CCSD ().