Bolstering stochastic gradient descent with model building

Ş. İlker Birbil, Özgür Martin, Gönenç Onay and Figen Öztoprak
Additional contact information
Ş. İlker Birbil: University of Amsterdam
Özgür Martin: Mimar Sinan Fine Arts University
Gönenç Onay: Galatasaray University
Figen Öztoprak: Gebze Technical University

TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, 2024, vol. 32, issue 3, No 7, 517-536

Abstract: The stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when the algorithms are fine-tuned for the application at hand. Although this tuning process can incur large computational costs, recent work has shown that these costs can be reduced by line search methods that iteratively adjust the step length. We propose an alternative approach to stochastic line search by using a new algorithm based on forward step model building. This model building step incorporates second-order information that allows adjusting not only the step length but also the search direction. Noting that deep learning model parameters come in groups (layers of tensors), our method builds its model and calculates a new step for each parameter group. This novel diagonalization approach makes the selected step lengths adaptive. We provide convergence rate analysis, and experimentally show that the proposed algorithm achieves faster convergence and better generalization in well-known test problems. More precisely, the proposed algorithm (SMB) requires less tuning and shows comparable performance to other adaptive methods.
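
The idea of taking a forward trial step and then correcting it separately for each parameter group can be illustrated with a minimal sketch. This is not the update rule from the article, which is not reproduced in this record. As a stand-in, the sketch accepts a plain gradient step when an Armijo-type decrease condition holds, and otherwise rescales each group's step with a one-dimensional secant (quadratic) model built from the gradient change at the trial point. The names `loss_and_grads` and `group_step`, the toy least-squares problem, and the constants `alpha` and `c` are all assumptions made for illustration.

```python
import numpy as np

def loss_and_grads(params, X, y):
    """Least-squares loss of pred = X @ w + b, with one gradient per group."""
    r = X @ params["w"] + params["b"] - y
    loss = 0.5 * np.mean(r ** 2)
    grads = {"w": X.T @ r / len(y), "b": np.array([np.mean(r)])}
    return loss, grads

def group_step(params, X, y, alpha=1.0, c=1e-4):
    """One step on a batch (X, y): trial step first, per-group model if it fails.

    Hypothetical simplification: the fallback only rescales each group's
    step along its own negative gradient and ignores cross-group curvature.
    """
    f0, g0 = loss_and_grads(params, X, y)
    trial = {k: params[k] - alpha * g0[k] for k in params}
    f1, g1 = loss_and_grads(trial, X, y)

    # Armijo-type sufficient decrease test on the forward (trial) step.
    gnorm2 = sum(float(np.vdot(g0[k], g0[k])) for k in params)
    if f1 <= f0 - c * alpha * gnorm2:
        return trial

    new = {}
    for k in params:
        s = trial[k] - params[k]      # forward step of this group
        yk = g1[k] - g0[k]            # gradient change seen by this group
        curv = float(np.vdot(s, yk))  # secant curvature estimate along s
        if curv > 1e-12:
            # Minimize m(t) = t * <g0, s> + 0.5 * t**2 * <s, yk> over [0, 1].
            t = -float(np.vdot(g0[k], s)) / curv
            t = min(max(t, 0.0), 1.0)
        else:
            t = 0.5                   # no usable curvature: halve the step
        new[k] = params[k] + t * s
    return new

# Toy problem with two parameter groups; the data follow y = 1 * x + 1.
X = np.array([[2.0], [-2.0]])
y = np.array([3.0, -1.0])
params = {"w": np.zeros(1), "b": np.zeros(1)}

# With alpha = 1.0 the trial step overshoots, so the per-group model is used.
params = group_step(params, X, y, alpha=1.0)
print(params)
```

On this toy problem the two groups are decoupled, so the fallback shortens the step for `w` to a quarter of the trial step and keeps the full step for `b`. The two groups therefore receive different effective step lengths from the same initial `alpha`, which is the kind of per-group adaptivity the abstract describes.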

Keywords: Model building; Second-order information; Stochastic gradient descent; Convergence analysis; 90C26; 90C06; 90C30; 90C15
Date: 2024

Downloads: (external link)
http://link.springer.com/10.1007/s11750-024-00673-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:topjnl:v:32:y:2024:i:3:d:10.1007_s11750-024-00673-z

Ordering information: This journal article can be ordered from
http://link.springer.de/orders.htm

DOI: 10.1007/s11750-024-00673-z

TOP: An Official Journal of the Spanish Society of Statistics and Operations Research is currently edited by Juan José Salazar González and Gustavo Bergantiños

More articles in TOP: An Official Journal of the Spanish Society of Statistics and Operations Research from Springer, Sociedad de Estadística e Investigación Operativa
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:topjnl:v:32:y:2024:i:3:d:10.1007_s11750-024-00673-z