Two Novel Non-Uniform Quantizers with Application in Post-Training Quantization

Perić, Zoran; Aleksić, Danijela; Nikolić, Jelena; Tomić, Stefan

Two Novel Non-Uniform Quantizers with Application in Post-Training Quantization

Zoran Perić, Danijela Aleksić (), Jelena Nikolić and Stefan Tomić
Additional contact information
Zoran Perić: Faculty of Electronic Engineering, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia
Danijela Aleksić: Department of Mobile Network Nis, Telekom Srbija, Vozdova 11, 18000 Nis, Serbia
Jelena Nikolić: Faculty of Electronic Engineering, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia
Stefan Tomić: School of Engineering and Technology, Al Dar University College, Dubai P.O. Box 35529, United Arab Emirates

Mathematics, 2022, vol. 10, issue 19, 1-21

Abstract: With increased network downsizing and cost minimization in deployment of neural network (NN) models, the utilization of edge computing takes a significant place in modern artificial intelligence today. To bridge the memory constraints of less-capable edge systems, a plethora of quantizer models and quantization techniques are proposed for NN compression with the goal of enabling the fitting of the quantized NN (QNN) on the edge device and guaranteeing a high extent of accuracy preservation. NN compression by means of post-training quantization has attracted a lot of research attention, where the efficiency of uniform quantizers (UQs) has been promoted and heavily exploited. In this paper, we propose two novel non-uniform quantizers (NUQs) that prudently utilize one of the two properties of the simplest UQ. Although having the same quantization rule for specifying the support region, both NUQs have a different starting setting in terms of cell width, compared to a standard UQ. The first quantizer, named the simplest power-of-two quantizer (SPTQ), defines the width of cells that are multiplied by the power of two. As it is the case in the simplest UQ design, the representation levels of SPTQ are midpoints of the quantization cells. The second quantizer, named the modified SPTQ (MSPTQ), is a more competitive quantizer model, representing an enhanced version of SPTQ in which the quantizer decision thresholds are centered between the nearest representation levels, similar to the UQ design. These properties make the novel NUQs relatively simple. Unlike UQ, the quantization cells of MSPTQ are not of equal widths and the representation levels are not midpoints of the quantization cells. In this paper, we describe the design procedure of SPTQ and MSPTQ and we perform their optimization for the assumed Laplacian source. Afterwards, we perform post-training quantization by implementing SPTQ and MSPTQ, study the viability of QNN accuracy and show the implementation benefits over the case where UQ of an equal number of quantization cells is utilized in QNN for the same classification task. We believe that both NUQs are particularly substantial for memory-constrained environments, where simple and acceptably accurate solutions are of crucial importance.

Keywords: non-uniform quantization; support region; post-training quantization; quantized neural networks (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/2227-7390/10/19/3435/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/19/3435/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:19:p:3435-:d:921130

Access Statistics for this article

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().