A New Likelihood Ratio Method for Training Artificial Neural Networks
Yijie Peng,
Li Xiao,
Bernd Heidergott,
L. Jeff Hong and
Henry Lam
Additional contact information
Yijie Peng: Department of Management Science and Information Systems, Guanghua School of Management, Peking University, Beijing 100871, China
Li Xiao: Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
Bernd Heidergott: Department of Operations Analytics, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, Netherlands
L. Jeff Hong: Department of Management Science, School of Management, Fudan University, Shanghai 200433, China
Henry Lam: Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027
INFORMS Journal on Computing, 2022, vol. 34, issue 1, 638-655
Abstract:
We investigate a new approach to compute the gradients of artificial neural networks (ANNs), based on the so-called push-out likelihood ratio method. Unlike the widely used backpropagation (BP) method, which requires continuity of the loss function and the activation function, our approach bypasses this requirement by injecting artificial noise into the signals passed along the neurons. We show that this approach has a computational complexity similar to that of BP, and is moreover advantageous in that it removes the backward recursion and yields transparent formulas. We also formalize the connection between BP, a pivotal technique for training ANNs, and infinitesimal perturbation analysis, a classic path-wise derivative estimation approach, so that both our newly proposed method and BP can be better understood in the context of stochastic gradient estimation. Our approach allows efficient training of ANNs with more flexibility in the loss and activation functions, and shows empirical improvements in the robustness of ANNs under adversarial attacks and natural noise corruptions.

Summary of Contribution: Stochastic gradient estimation has been actively studied in simulation for decades and has become more important in the era of machine learning and artificial intelligence. Stochastic gradient descent is a standard technique for training artificial neural networks (ANNs), a pivotal problem in deep learning. The most popular stochastic gradient estimation technique is the backpropagation method. We find that the backpropagation method lies in the family of infinitesimal perturbation analysis, a path-wise gradient estimation technique in simulation. Moreover, we develop a new likelihood ratio-based method, drawn from another popular family of gradient estimation techniques in simulation, for training more general ANNs, and demonstrate that the new training method can improve the robustness of ANNs.
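The abstract's core idea, writing the gradient as an expectation of the loss multiplied by the score function of artificially injected noise so that neither the loss nor the activation needs to be differentiated, can be illustrated with a generic score-function (likelihood-ratio) estimator. The sketch below is not the paper's push-out estimator; the single noisy layer, the step activation, the noise level sigma, the sample size, and the toy data are all illustrative assumptions.

# Minimal sketch of a likelihood-ratio (score-function) gradient estimator
# for one noisy layer. Gaussian noise is injected into the pre-activation,
# and the gradient is estimated as E[loss * d log p(noise)/d parameters],
# so the loss and activation are never differentiated.
import numpy as np

rng = np.random.default_rng(0)

def step(z):
    # A discontinuous activation: the chain rule of backpropagation fails
    # here, but the estimator below only evaluates it, never differentiates it.
    return (z > 0).astype(float)

def lr_gradient(W, b, x, y, sigma=0.1, n_samples=1000):
    """Monte Carlo likelihood-ratio estimates of d E[loss]/dW and d E[loss]/db.

    Noise model: z ~ N(Wx + b, sigma^2 I); output a = step(z); squared-error
    loss. The Gaussian score is d log p(z)/dv = (z - v) / sigma^2, v = Wx + b.
    """
    v = W @ x + b                                # mean of the injected noise
    z = v + sigma * rng.standard_normal((n_samples, v.size))
    loss = np.sum((step(z) - y) ** 2, axis=1)    # per-sample loss values
    score_v = (z - v) / sigma ** 2               # d log p(z)/dv for each sample
    grad_v = (loss[:, None] * score_v).mean(axis=0)
    # v = Wx + b is deterministic and linear, so the gradient in v maps to
    # W and b exactly via the outer product with x.
    return np.outer(grad_v, x), grad_v

# Toy usage: one stochastic gradient step on a single (x, y) pair.
x = np.array([1.0, -2.0, 0.5])
y = np.array([1.0, 0.0])
W = rng.standard_normal((2, 3)) * 0.1
b = np.zeros(2)
gW, gb = lr_gradient(W, b, x, y)
W -= 0.01 * gW
b -= 0.01 * gb

Backpropagation cannot train the step activation above because its derivative is zero almost everywhere; the likelihood-ratio estimator instead differentiates the density of the injected noise, which is the flexibility on loss and activation functions that the abstract refers to.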
Keywords: stochastic gradient estimation; artificial neural network; image identification
Date: 2022
Downloads: http://dx.doi.org/10.1287/ijoc.2021.1088 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:orijoc:v:34:y:2022:i:1:p:638-655