Utilizing the Attention Mechanism for Accuracy Prediction in Quantized Neural Networks
Lu Wei,
Zhong Ma,
Chaojie Yang,
Qin Yao and
Wei Zheng
Additional contact information
Lu Wei: School of Software, Northwestern Polytechnical University, Xi’an 710072, China
Zhong Ma: Third Technical Department, Xi’an Microelectronics Technology Institute, Xi’an 710065, China
Chaojie Yang: Third Technical Department, Xi’an Microelectronics Technology Institute, Xi’an 710065, China
Qin Yao: School of Software, Northwestern Polytechnical University, Xi’an 710072, China
Wei Zheng: School of Software, Northwestern Polytechnical University, Xi’an 710072, China
Mathematics, 2025, vol. 13, issue 5, 1-20
Abstract:
Quantization plays a crucial role in deploying neural network models on resource-limited hardware. However, current quantization methods suffer from large accuracy loss and poor generalization on complex tasks. These issues hinder the practical application of deep learning and large language models in smart systems. The main problem is our limited understanding of how quantization affects accuracy, and there is also a need for more effective approaches to evaluating the performance of quantized models. To address these concerns, we develop a novel method that leverages the self-attention mechanism. The method uses a transformer encoder and decoder to predict a quantized model’s accuracy from a single representative image from the test set. Across three types of neural network models, the prediction error of the quantization accuracy is 2.44%. The proposed method enables rapid performance assessment of quantized models during the development stage, thereby facilitating the optimization of quantization parameters and promoting the practical application of neural network models.
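As background for the accuracy loss the abstract refers to, the following is a minimal sketch of how quantization introduces error into a model's weights. It is not the authors' method: the abstract does not state which quantization scheme is used, so the sketch assumes symmetric uniform quantization, and the function names (`quantize_symmetric`, `mean_abs_error`) and the weight values are hypothetical, chosen only for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Symmetric uniform quantization (an assumed scheme, for illustration).

    Maps floats to signed integers in [-qmax, qmax], then maps them back
    to floats so the rounding error can be inspected.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    dequantized = [q * scale for q in quantized]
    return quantized, dequantized, scale


def mean_abs_error(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Made-up weights, not taken from the paper.
weights = [0.42, -1.30, 0.07, 0.95, -0.58]

for bits in (8, 4, 2):
    _, restored, _ = quantize_symmetric(weights, bits)
    print(f"{bits}-bit mean absolute weight error: "
          f"{mean_abs_error(weights, restored):.4f}")
```

With fewer bits the weights are rounded to a coarser grid, so the reconstruction error grows. How that weight-level error translates into task accuracy is the quantity the paper's predictor is meant to estimate.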
Keywords: neural network; quantization; accuracy; prediction; attention
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/5/732/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/5/732/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:5:p:732-:d:1598563
Mathematics is currently edited by Ms. Emma He