Portable multi-node LQCD Monte Carlo simulations using OpenACC
Claudio Bonati,
Enrico Calore,
Massimo D’Elia,
Michele Mesiti,
Francesco Negro,
Francesco Sanfilippo,
Sebastiano Fabio Schifano,
Giorgio Silvi and
Raffaele Tripiccione
Additional contact information
Claudio Bonati: Università di Pisa and INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy
Enrico Calore: Università degli Studi di Ferrara and INFN Sezione di Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
Massimo D’Elia: Università di Pisa and INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy
Michele Mesiti: Academy of Advanced Computing, Swansea University, Singleton Park, Swansea SA2 8PP, UK
Francesco Negro: INFN Sezione di Pisa, Largo Pontecorvo 3, I-56127 Pisa, Italy
Francesco Sanfilippo: INFN Sezione di Roma3, Via della Vasca Navale 84, I-00146 Roma, Italy
Sebastiano Fabio Schifano: Università degli Studi di Ferrara and INFN Sezione di Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
Giorgio Silvi: Jülich Supercomputing Centre, Forschungszentrum Jülich, Wilhelm-Johnen-Straße, 52428 Jülich, Germany
Raffaele Tripiccione: Università degli Studi di Ferrara and INFN Sezione di Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
International Journal of Modern Physics C (IJMPC), 2018, vol. 29, issue 01, 1-21
Abstract:
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization across multiple computing nodes, using OpenACC to manage parallelism within each node and MPI to manage parallelism among nodes. We first discuss the available strategies to maximize performance, then describe selected relevant details of the code, and finally measure the performance and scaling behavior that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly higher level of performance for this application, but also compares with results measured on other processors.
Keywords: Lattice-QCD; OpenACC; portability; MPI; GPU
Date: 2018
Downloads: http://www.worldscientific.com/doi/abs/10.1142/S0129183118500109 (access to full text is restricted to subscribers)
Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijmpcx:v:29:y:2018:i:01:n:s0129183118500109
DOI: 10.1142/S0129183118500109
International Journal of Modern Physics C (IJMPC) is currently edited by H. J. Herrmann
More articles in International Journal of Modern Physics C (IJMPC) from World Scientific Publishing Co. Pte. Ltd.