Comprehensive analysis of gradient-based hyperparameter optimization algorithms

O. Y. Bakhteev and V. V. Strijov
Additional contact information
O. Y. Bakhteev: Moscow Institute of Physics and Technology
V. V. Strijov: Moscow Institute of Physics and Technology

Annals of Operations Research, 2020, vol. 289, issue 1, No 4, 65 pages

Abstract: The paper investigates the hyperparameter optimization problem. Hyperparameters are the parameters of the distribution of model parameters. An adequate choice of hyperparameter values prevents model overfitting and allows the model to achieve higher predictive performance. Neural network models with a large number of hyperparameters are analyzed. Hyperparameter optimization for such models is computationally expensive. The paper proposes modifications of various gradient-based methods to optimize many hyperparameters simultaneously. The paper compares the experimental results with random search. The main contribution of the paper is an analysis of hyperparameter optimization algorithms for models with a large number of parameters. To select precise and stable models, the authors suggest using two model selection criteria: cross-validation and the evidence lower bound. The experiments show that the models optimized using the evidence lower bound give a higher error rate than the models obtained using cross-validation. The models optimized using the evidence lower bound also show greater stability when the data is noisy. Using the evidence lower bound is preferable when the model tends to overfit or when cross-validation is computationally expensive. The algorithms are evaluated on regression and classification datasets.

Keywords: Gradient descent; Hyperparameter optimization; Model selection; Neural networks; Classification; Regression
Date: 2020

Downloads: (external link)
http://link.springer.com/10.1007/s10479-019-03286-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:289:y:2020:i:1:d:10.1007_s10479-019-03286-z

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479

DOI: 10.1007/s10479-019-03286-z

Annals of Operations Research is currently edited by Endre Boros

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:annopr:v:289:y:2020:i:1:d:10.1007_s10479-019-03286-z