Portable multi-node LQCD Monte Carlo simulations using OpenACC
Claudio Bonati,
Enrico Calore,
Massimo D’Elia,
Michele Mesiti,
Francesco Negro,
Francesco Sanfilippo,
Sebastiano Fabio Schifano,
Giorgio Silvi and
Raffaele Tripiccione
Additional contact information
Claudio Bonati: Università di Pisa and INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy
Enrico Calore: Università degli Studi di Ferrara and INFN Sezione di Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
Massimo D’Elia: Università di Pisa and INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy
Michele Mesiti: Academy of Advanced Computing, Swansea University, Singleton Park, Swansea SA2 8PP, UK
Francesco Negro: INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy
Francesco Sanfilippo: INFN Sezione di Roma3, Via della Vasca Navale 84, I-00146 Roma, Italy
Sebastiano Fabio Schifano: Università degli Studi di Ferrara and INFN Sezione di Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
Giorgio Silvi: Jülich Supercomputing Centre, Forschungszentrum Jülich, Wilhelm-Johnen-Straße, 52428 Jülich, Germany
Raffaele Tripiccione: Università degli Studi di Ferrara and INFN Sezione di Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
International Journal of Modern Physics C (IJMPC), 2018, vol. 29, issue 01, 1-21
Abstract:
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization across multiple computing nodes, using OpenACC to manage parallelism within each node and MPI to manage parallelism among nodes. We first discuss the available strategies to maximize performance, then describe selected relevant details of the code, and finally measure the performance and scaling behavior that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly higher level of performance for this application, but also compares with results measured on other processors.
Keywords: Lattice-QCD; OpenACC; portability; MPI; GPU
Date: 2018
Downloads: http://www.worldscientific.com/doi/abs/10.1142/S0129183118500109 (access to full text is restricted to subscribers)
Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijmpcx:v:29:y:2018:i:01:n:s0129183118500109
DOI: 10.1142/S0129183118500109
International Journal of Modern Physics C (IJMPC) is currently edited by H. J. Herrmann
More articles in International Journal of Modern Physics C (IJMPC) from World Scientific Publishing Co. Pte. Ltd.