PARS: Proxy-Based Automatic Rank Selection for Neural Network Compression via Low-Rank Weight Approximation
Konstantin Sobolev,
Dmitry Ermilov,
Anh-Huy Phan and
Andrzej Cichocki
Additional contact information
Konstantin Sobolev: Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
Dmitry Ermilov: Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
Anh-Huy Phan: Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
Andrzej Cichocki: Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
Mathematics, 2022, vol. 10, issue 20, 1-22
Abstract:
Low-rank matrix/tensor decompositions are promising methods for reducing the inference time, computation, and memory consumption of deep neural networks (DNNs). This group of methods decomposes the pre-trained neural network weights through low-rank matrix/tensor decomposition and replaces the original layers with lightweight factorized layers. A main drawback of the technique is that it demands a great amount of time and effort to select the best ranks of tensor decomposition for each layer in a DNN. This paper proposes a Proxy-based Automatic tensor Rank Selection method (PARS) that utilizes a Bayesian optimization approach to find the best combination of ranks for neural network (NN) compression. We observe that the decomposition of weight tensors adversely influences the feature distribution inside the neural network and impairs the predictability of the post-compression DNN performance. Based on this finding, a novel proxy metric is proposed to deal with this issue and to increase the quality of the rank search procedure. Experimental results show that PARS improves the results of existing decomposition methods on several representative NNs, including ResNet-18, ResNet-56, VGG-16, and AlexNet. We obtain a 3× FLOP reduction with almost no loss of accuracy for ILSVRC-2012 ResNet-18 and a 5.5× FLOP reduction with an accuracy improvement for ILSVRC-2012 VGG-16.
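The core idea of replacing a layer with lightweight factorized layers can be illustrated with a minimal numpy sketch: a dense weight matrix is truncated via SVD to a chosen rank, and one matrix multiply becomes two cheaper ones. This is only a hand-picked-rank toy example of the general low-rank technique, not the authors' PARS method (which selects ranks per layer via Bayesian optimization over a proxy metric); all names and sizes below are illustrative assumptions.

```python
import numpy as np

# Toy low-rank weight factorization (NOT the PARS method itself):
# a dense layer W (m x n) is replaced by two factors of rank r,
# with r chosen by hand here instead of by Bayesian optimization.
rng = np.random.default_rng(0)
m, n, r = 256, 512, 32

W = rng.standard_normal((m, n))
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Truncate to rank r: W ~= (U_r * s_r) @ Vt_r, i.e. two lightweight layers.
U_r = U[:, :r] * s[:r]   # shape (m, r)
Vt_r = Vt[:r, :]         # shape (r, n)

x = rng.standard_normal(n)
y_full = W @ x
y_low = U_r @ (Vt_r @ x)  # factorized forward pass

# Multiply-accumulate counts: one m*n matmul vs. r*(m + n).
flops_full = m * n
flops_low = r * (m + n)
rel_err = np.linalg.norm(y_full - y_low) / np.linalg.norm(y_full)
print(f"FLOP reduction: {flops_full / flops_low:.2f}x, relative error: {rel_err:.3f}")
```

The trade-off the paper automates is visible here: a smaller rank r shrinks the factorized cost r·(m + n) but increases the approximation error, and the best r differs from layer to layer.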
Keywords: convolutional neural network acceleration; deep learning; low-rank tensor decomposition; rank selection
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/10/20/3801/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/20/3801/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:20:p:3801-:d:942932
Mathematics is currently edited by Ms. Emma He