Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Arnulf Jentzen and Timo Welti
Applied Mathematics and Computation, 2023, vol. 455, issue C
Abstract:
In spite of the accomplishments of deep learning based algorithms in numerous applications and very broad corresponding research interest, at the moment there is still no rigorous understanding of the reasons why such algorithms produce useful results in certain situations. A thorough mathematical analysis of deep learning based algorithms seems to be crucial in order to improve our understanding and to make their implementation more effective and efficient. In this article we provide a mathematically rigorous full error analysis of deep learning based empirical risk minimisation with quadratic loss function in the probabilistically strong sense, where the underlying deep neural networks are trained using stochastic gradient descent with random initialisation. The convergence speed we obtain suffers under the curse of dimensionality. However, it is presumably close to optimal in the generality of the framework we consider and, to the best of our knowledge, we establish the first full error analysis in the scientific literature for a deep learning based algorithm in the probabilistically strong sense as well as the first full error analysis in the scientific literature for a deep learning based algorithm where stochastic gradient descent with random initialisation is the employed optimisation method.
Keywords: Deep learning; Deep neural networks; Empirical risk minimisation; Full error analysis; Approximation; Generalisation; Optimisation; Strong convergence; Stochastic gradient descent; Random initialisation
Date: 2023
Downloads: http://www.sciencedirect.com/science/article/pii/S0096300323000760
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:apmaco:v:455:y:2023:i:c:s0096300323000760
DOI: 10.1016/j.amc.2023.127907
Applied Mathematics and Computation is currently edited by Theodore Simos
More articles in Applied Mathematics and Computation from Elsevier
Bibliographic data for series maintained by Catherine Liu.