On the Detection and Interpretation of Performance Variations of HPC Applications

Dennis Hoppe, Li Zhong, Stefan Andersson and Diana Moise
Additional contact information
Dennis Hoppe: High Performance Computing Center Stuttgart
Li Zhong: High Performance Computing Center Stuttgart
Stefan Andersson: Amazon Web Services (AWS)
Diana Moise: Cray Inc.

A chapter in Sustained Simulation Performance 2018 and 2019, 2020, pp 41-56 from Springer

Abstract: Supercomputers are synonymous with maximum performance, so one would expect every run of a parallel application to yield the same runtime, provided that input parameters and data are unchanged. Practice, however, clearly demonstrates that this is not the case. Supercomputers are built with multi-user usage in mind, meaning that typically several hundred applications run simultaneously on a multitude of compute nodes. Although these compute nodes are assigned exclusively to users, the network and data storage are shared among all, so interference between applications is inevitable. In this paper, we evaluate application runs on a Cray XC40 system. The objective is to identify so-called aggressor applications, which have a negative impact on the performance of simultaneously running applications and cause unforeseeably longer runtimes. We discuss the characteristics of aggressors and victims and introduce several detection strategies to identify these victims, and thus also potential aggressors. Finally, a study demonstrates the effectiveness of the approach by identifying an aggressor and optimizing its source code, which resulted in less interference.

Date: 2020

Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-030-39181-2_5

Ordering information: This item can be ordered from
http://www.springer.com/9783030391812

DOI: 10.1007/978-3-030-39181-2_5

Handle: RePEc:spr:sprchp:978-3-030-39181-2_5