On Block g -Circulant Matrices with Discrete Cosine and Sine Transforms for Transformer-Based Translation Machine
Euis Asriani (),
Intan Muchtadi-Alamsyah and
Ayu Purwarianti
Additional contact information
Euis Asriani: Doctoral Program of Mathematics, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Bandung 40132, Indonesia
Intan Muchtadi-Alamsyah: Algebra Research Group, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Bandung 40132, Indonesia
Ayu Purwarianti: University Center of Excellence Artificial Intelligence on Vision, Natural Language Processing and Big Data Analytics (U-CoE AI-VLB), Institut Teknologi Bandung, Bandung 40132, Indonesia
Mathematics, 2024, vol. 12, issue 11, 1-19
Abstract:
Transformer has emerged as one of the modern neural networks that has been applied in numerous applications. However, transformers’ large and deep architecture makes them computationally and memory-intensive. In this paper, we propose the block g -circulant matrices to replace the dense weight matrices in the feedforward layers of the transformer and leverage the DCT-DST algorithm to multiply these matrices with the input vector. Our test using Portuguese-English datasets shows that the suggested method improves model memory efficiency compared to the dense transformer but at the cost of a slight drop in accuracy. We found that the model Dense-block 1-circulant DCT-DST of 128 dimensions achieved the highest model memory efficiency at 22.14%. We further show that the same model achieved a BLEU score of 26.47%.
Keywords: transformer; block g-circulant matrices; DCT-DST algorithm; kronecker product (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/12/11/1697/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/11/1697/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:11:p:1697-:d:1405039
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().