Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization
Matthew E. Wilhelm,
Chenyu Wang and
Matthew D. Stuber
Additional contact information
Matthew E. Wilhelm: University of Connecticut
Chenyu Wang: University of Connecticut
Matthew D. Stuber: University of Connecticut
Journal of Global Optimization, 2023, vol. 85, issue 3, No 2, 569-594
Abstract:
In this work, we present general methods to construct convex/concave relaxations of the activation functions that are commonly chosen for artificial neural networks (ANNs). The choice of these functions is often informed by broader modeling considerations balanced against the need for high computational performance. The direct application of factorable programming techniques to compute bounds and convex/concave relaxations of such functions often leads to weak enclosures due to the dependency problem. Moreover, the piecewise formulation that defines several popular activation functions prevents the computation of convex/concave relaxations because it violates the factorable function requirement. To improve the performance of relaxations of ANNs for deterministic global optimization applications, this study presents the development of a library of envelopes of the thoroughly studied rectifier-type and sigmoid activation functions, in addition to the novel self-gated sigmoid-weighted linear unit (SiLU) and Gaussian error linear unit activation functions. We demonstrate that the envelopes of activation functions directly lead to tighter relaxations of ANNs on their input domain. In turn, these improvements translate to a dramatic reduction in the CPU runtime required to solve optimization problems involving ANN models to epsilon-global optimality. We further demonstrate that the factorable programming approach leads to superior computational performance over alternative state-of-the-art approaches.
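As a rough illustration of what an envelope of an activation function looks like, the sketch below evaluates the convex and concave envelopes of the rectified linear unit (ReLU) on an interval. It is not the authors' library: the function names `relu` and `relu_envelopes` and the bounds `lo` and `hi` are hypothetical, and Python is used only for illustration. It relies on the standard fact that a convex function is its own convex envelope on an interval, while its concave envelope is the secant line through the interval endpoints.

```python
def relu(x: float) -> float:
    """Rectified linear unit, max(0, x)."""
    return max(0.0, x)


def relu_envelopes(x: float, lo: float, hi: float) -> tuple[float, float]:
    """Return (convex, concave) envelope values of ReLU at x on [lo, hi].

    Illustrative helper (hypothetical name), assuming lo <= x <= hi.
    """
    if not lo <= x <= hi:
        raise ValueError("x must lie in [lo, hi]")
    # ReLU is convex, so its convex envelope is the function itself.
    convex = relu(x)
    if hi == lo:
        # Degenerate interval: both envelopes collapse to the function value.
        return convex, convex
    # The concave envelope of a convex function on an interval is the
    # secant line through (lo, relu(lo)) and (hi, relu(hi)).
    slope = (relu(hi) - relu(lo)) / (hi - lo)
    concave = relu(lo) + slope * (x - lo)
    return convex, concave


if __name__ == "__main__":
    # On [-1, 1] the kink at 0 lies inside the interval, so the two
    # envelopes differ at x = 0.
    print(relu_envelopes(0.0, -1.0, 1.0))
```

When the interval lies entirely on one side of zero, ReLU is linear there and the two envelopes coincide. The gap appears only when the kink is interior to the interval.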
Keywords: Artificial neural networks; Machine learning; Deterministic global optimization; Factorable programming; McCormick relaxations; Envelopes; Julia programming
Date: 2023
Downloads: (external link)
http://link.springer.com/10.1007/s10898-022-01228-x Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:jglopt:v:85:y:2023:i:3:d:10.1007_s10898-022-01228-x
Ordering information: This journal article can be ordered from
http://www.springer. ... search/journal/10898
DOI: 10.1007/s10898-022-01228-x
Journal of Global Optimization is currently edited by Sergiy Butenko
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.