Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization

Wilhelm, Matthew E.; Wang, Chenyu; Stuber, Matthew D.

Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization

Matthew E. Wilhelm (), Chenyu Wang () and Matthew D. Stuber ()
Additional contact information
Matthew E. Wilhelm: University of Connecticut
Chenyu Wang: University of Connecticut
Matthew D. Stuber: University of Connecticut

Journal of Global Optimization, 2023, vol. 85, issue 3, No 2, 569-594

Abstract: Abstract In this work, we present general methods to construct convex/concave relaxations of the activation functions that are commonly chosen for artificial neural networks (ANNs). The choice of these functions is often informed by both broader modeling considerations balanced with a need for high computational performance. The direct application of factorable programming techniques to compute bounds and convex/concave relaxations of such functions often lead to weak enclosures due to the dependency problem. Moreover, the piecewise formulation that defines several popular activation functions, prevents the computation of convex/concave relaxations as they violate the factorable function requirement. To improve the performance of relaxations of ANNs for deterministic global optimization applications, this study presents the development of a library of envelopes of the thoroughly studied rectifier-type and sigmoid activation functions, in addition to the novel self-gated sigmoid-weighted linear unit (SiLU) and Gaussian error linear unit activation functions. We demonstrate that the envelopes of activation functions directly lead to tighter relaxations of ANNs on their input domain. In turn, these improvements translate to a dramatic reduction in CPU runtime required for solving optimization problems involving ANN models to epsilon-global optimality. We further demonstrate that the factorable programming approach leads to superior computational performance over alternative state-of-the-art approaches.

Keywords: Artificial neural networks; Machine learning; Deterministic global optimization; Factorable programming; McCormick relaxations; Envelopes; Julia programming (search for similar items in EconPapers)
Date: 2023
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s10898-022-01228-x Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:jglopt:v:85:y:2023:i:3:d:10.1007_s10898-022-01228-x

Ordering information: This journal article can be ordered from
http://www.springer. ... search/journal/10898

DOI: 10.1007/s10898-022-01228-x

Access Statistics for this article

Journal of Global Optimization is currently edited by Sergiy Butenko

More articles in Journal of Global Optimization from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().