Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation

Jentzen, Arnulf; Welti, Timo

Applied Mathematics and Computation, 2023, vol. 455, issue C

Abstract: In spite of the accomplishments of deep learning based algorithms in numerous applications and very broad corresponding research interest, at the moment there is still no rigorous understanding of the reasons why such algorithms produce useful results in certain situations. A thorough mathematical analysis of deep learning based algorithms seems to be crucial in order to improve our understanding and to make their implementation more effective and efficient. In this article we provide a mathematically rigorous full error analysis of deep learning based empirical risk minimisation with quadratic loss function in the probabilistically strong sense, where the underlying deep neural networks are trained using stochastic gradient descent with random initialisation. The convergence speed we obtain suffers under the curse of dimensionality. However, it is presumably close to optimal in the generality of the framework we consider and, to the best of our knowledge, we establish the first full error analysis in the scientific literature for a deep learning based algorithm in the probabilistically strong sense as well as the first full error analysis in the scientific literature for a deep learning based algorithm where stochastic gradient descent with random initialisation is the employed optimisation method.

Keywords: Deep learning; Deep neural networks; Empirical risk minimisation; Full error analysis; Approximation; Generalisation; Optimisation; Strong convergence; Stochastic gradient descent; Random initialisation
Date: 2023

Downloads:
http://www.sciencedirect.com/science/article/pii/S0096300323000760
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:apmaco:v:455:y:2023:i:c:s0096300323000760

DOI: 10.1016/j.amc.2023.127907

Applied Mathematics and Computation is currently edited by Theodore Simos
