EconPapers    
Economics at your fingertips  
 

Block layer decomposition schemes for training deep neural networks

Laura Palagi and Ruggiero Seccia ()
Additional contact information
Ruggiero Seccia: Sapienza Univerity of Rome

Journal of Global Optimization, 2020, vol. 77, issue 1, No 6, 97-124

Abstract: Abstract Deep feedforward neural networks’ (DFNNs) weight estimation relies on the solution of a very large nonconvex optimization problem that may have many local (no global) minimizers, saddle points and large plateaus. Furthermore, the time needed to find good solutions of the training problem heavily depends on both the number of samples and the number of weights (variables). In this work, we show how block coordinate descent (BCD) methods can be fruitful applied to DFNN weight optimization problem and embedded in online frameworks possibly avoiding bad stationary points. We first describe a batch BCD method able to effectively tackle difficulties due to the network’s depth; then we further extend the algorithm proposing an online BCD scheme able to scale with respect to both the number of variables and the number of samples. We perform extensive numerical results on standard datasets using various deep networks. We show that the application of BCD methods to the training problem of DFNNs improves over standard batch/online algorithms in the training phase guaranteeing good generalization performance as well.

Keywords: Deep feedforward neural networks; Block coordinate decomposition; Online optimization; Large scale optimization (search for similar items in EconPapers)
Date: 2020
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s10898-019-00856-0 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:jglopt:v:77:y:2020:i:1:d:10.1007_s10898-019-00856-0

Ordering information: This journal article can be ordered from
http://www.springer. ... search/journal/10898

DOI: 10.1007/s10898-019-00856-0

Access Statistics for this article

Journal of Global Optimization is currently edited by Sergiy Butenko

More articles in Journal of Global Optimization from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-20
Handle: RePEc:spr:jglopt:v:77:y:2020:i:1:d:10.1007_s10898-019-00856-0