Generalization of Neural Networks on Second-Order Hypercomplex Numbers
Stanislav Pavlov,
Dmitry Kozlov,
Mikhail Bakulin,
Aleksandr Zuev,
Andrey Latyshev and
Alexander Beliaev
Additional contact information
All authors: NN AI Team, Huawei Russian Research Institute, St. Maksima Gorkogo, 117, Nizhny Novgorod 603006, Russia
Mathematics, 2023, vol. 11, issue 18, 1-19
Abstract:
The vast majority of existing neural networks operate by rules set within the algebra of real numbers. However, as theoretical understanding of the fundamentals of neural networks deepens and their practical applications multiply, new problems arise that require going beyond this algebra. Various tasks come to light in which the original data naturally have complex-valued formats. This situation encourages researchers to explore whether neural networks based on complex numbers can provide benefits over those limited to real numbers. Multiple recent works have been dedicated to developing the architecture and building blocks of complex-valued neural networks. In this paper, we generalize such models by considering the other second-order hypercomplex numbers: dual and double numbers. We develop basic operators for these algebras, such as convolution, activation functions, and batch normalization, and rebuild several real-valued networks to operate over these new algebras. We also develop a general methodology for dual- and double-valued gradient calculation based on Wirtinger derivatives for complex-valued functions. For classical computer vision (CIFAR-10, CIFAR-100, SVHN) and signal processing (G2Net, MusicNet) classification problems, our benchmarks show that the transition to the hypercomplex domain can help reach higher metric values than the original real-valued models.
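To make the algebras concrete: all three second-order hypercomplex systems share one multiplication rule and differ only in the square of the imaginary unit u, with u*u = -1 (complex), 0 (dual), or +1 (double). The Python sketch below is a minimal illustration of that rule, not the paper's implementation; the class name Hyper2 is a hypothetical choice. It also demonstrates the well-known property that dual numbers carry exact first derivatives, the mechanism behind forward-mode automatic differentiation.

```python
# A minimal sketch of second-order hypercomplex arithmetic.
# x = a + b*u, where the unit u satisfies u*u = s:
#   s = -1 -> complex numbers, s = 0 -> dual numbers, s = +1 -> double numbers.
class Hyper2:
    def __init__(self, a, b, s):
        self.a, self.b, self.s = float(a), float(b), s

    def __add__(self, other):
        return Hyper2(self.a + other.a, self.b + other.b, self.s)

    def __mul__(self, other):
        # (a + b*u)(c + d*u) = (a*c + s*b*d) + (a*d + b*c)*u
        return Hyper2(self.a * other.a + self.s * self.b * other.b,
                      self.a * other.b + self.b * other.a, self.s)

    def __repr__(self):
        unit = {-1: "i", 0: "eps", 1: "j"}[self.s]
        return f"{self.a:g} + {self.b:g}*{unit}"

# Dual numbers (s = 0) propagate exact first derivatives:
# f(a + eps) = f(a) + f'(a)*eps.  For f(x) = x*x + 3x at a = 2:
x = Hyper2(2, 1, s=0)
three = Hyper2(3, 0, s=0)
print(x * x + three * x)  # 10 + 7*eps  ->  f(2) = 10, f'(2) = 7
```

The single parameter s is what lets one codebase cover all three algebras, which is essentially the generalization the abstract describes.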
Keywords: deep learning; hypercomplex neural networks; complex numbers; dual numbers; double numbers; hypercomplex norm; hypercomplex batch normalization
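The convolution operator mentioned in the abstract can likewise be assembled from real-valued building blocks, in the same spirit as deep complex networks. Below is a hedged PyTorch sketch of a second-order hypercomplex 2D convolution built from two real convolutions; the class name Hyper2Conv2d, the parameter s, and the two-tensor interface are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Sketch: a second-order hypercomplex Conv2d composed of two real Conv2d
# layers. A value x = x_r + x_u*u is carried as the tensor pair (x_r, x_u),
# and the weight is W = W_r + W_u*u, with u*u = s.
class Hyper2Conv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, s=0, **kw):
        super().__init__()
        self.s = s  # s = -1 complex, 0 dual, +1 double
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, bias=False, **kw)
        self.conv_u = nn.Conv2d(in_ch, out_ch, kernel_size, bias=False, **kw)

    def forward(self, x_r, x_u):
        # (W_r + W_u*u) * (x_r + x_u*u)
        #   = (W_r*x_r + s*W_u*x_u) + (W_r*x_u + W_u*x_r)*u
        y_r = self.conv_r(x_r) + self.s * self.conv_u(x_u)
        y_u = self.conv_r(x_u) + self.conv_u(x_r)
        return y_r, y_u

# Usage: a dual-valued convolution (s = 0), where the s*W_u*x_u term vanishes.
layer = Hyper2Conv2d(3, 8, 3, s=0)
x_r = torch.randn(1, 3, 32, 32)
x_u = torch.randn(1, 3, 32, 32)
y_r, y_u = layer(x_r, x_u)
print(y_r.shape, y_u.shape)  # torch.Size([1, 8, 30, 30]) for both parts
```

Note that for dual numbers (s = 0) the cross term drops out of the real part, so the real channel is unaffected by the dual channel, while the dual channel mixes both; this asymmetry is a distinctive feature of the dual algebra.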
JEL-codes: C
Date: 2023
Downloads:
https://www.mdpi.com/2227-7390/11/18/3973/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/18/3973/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:18:p:3973-:d:1242841