Hybrid GPU–CPU Efficient Implementation of a Parallel Numerical Algorithm for Solving the Cauchy Problem for a Nonlinear Differential Riccati Equation of Fractional Variable Order

Tverdyi, Dmitrii; Parovik, Roman

Dmitrii Tverdyi: Institute of Cosmophysical Research and Radio Wave Propagation FEB RAS, Paratunka, Mirnaya Street 7, 684034 Kamchatka, Russia
Roman Parovik: Institute of Cosmophysical Research and Radio Wave Propagation FEB RAS, Paratunka, Mirnaya Street 7, 684034 Kamchatka, Russia

Mathematics, 2023, vol. 11, issue 15, 1-21

Abstract: Numerical solution of fractional dynamics problems can impose a high computational load, which makes efficient algorithms necessary. The main source of this load is heredity (memory): the current value of the solution function depends on its previous values over the time interval. Mathematically, heredity is described here by a fractional differentiation operator of variable order in the Gerasimov–Caputo sense. As an example, we consider the Cauchy problem for the non-linear fractional Riccati equation with non-constant coefficients. An efficient parallel implementation is proposed for a known sequential non-local explicit finite-difference scheme. The implementation is hybrid, since it uses both GPU and CPU computational nodes. The program code is written in C and CUDA C and is developed using the OpenMP and CUDA hardware and software architectures. The paper studies the computational efficiency of the proposed parallel algorithm using data from a series of computational experiments run on an NVIDIA DGX STATION computing server. The average computation time is analyzed in terms of running time, acceleration, efficiency, and cost of the algorithm. The test examples show that the hybrid version of the numerical algorithm can give a significant performance increase of 3–5 times in comparison with both the sequential version of the algorithm and the OpenMP implementation.

Keywords: explicit finite-difference schemes; parallel computing; CUDA; OpenMP; fractional derivatives; memory effect; hereditary; non-linear fractional equations
JEL-codes: C
Date: 2023

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/15/3358/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/15/3358/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:15:p:3358-:d:1207670

Mathematics is currently edited by Ms. Emma He
