Reasoning the Runtime Overhead of Profiling Tools
Mikhail Zarubin and Bert Wesarg
Additional contact information
Mikhail Zarubin: TUD Dresden University of Technology, Center for Interdisciplinary Digital Sciences (CIDS), Department Information Services and High Performance Computing (ZIH)
Bert Wesarg: GWT-TUD GmbH
A chapter in Tools for High Performance Computing 2023, 2026, pp 1-17 from Springer
Abstract:
Modern high-performance computing (HPC) depends on an ever-evolving hardware landscape. Supercomputers, typically composed of hundreds to thousands of heterogeneous computing units, are further complicated by a variety of available memory and storage architectures. Consequently, efficient HPC-oriented parallel data processing and computation are becoming increasingly complex, even for experienced users. To address this challenge and streamline the deployment of modern HPC resources, academia and industry have created various performance analysis and profiling tools. These tools collect information on program execution, enabling informed decisions and program optimization. A recent study naively applied a variety of profiling tools to well-known MPI-centric HPC proxy applications and compared the resulting runtime overhead, memory consumption, and call path data. The results demonstrated that instrumentation-based tools, like Score-P and TAU, have limitations. We reproduced the experiments and identified the causes of these limitations. First, we enhanced libunwind, which is responsible for collecting backtraces from a signal handler, to make it more reliable. These enhancements not only enabled Score-P to produce results for all experiment configurations but also reduced the overhead for one benchmark. Second, we show that one proxy application exercises a common MPI communication pattern that induces a high overhead for instrumentation-based tools. We re-ran all of the experiments using a version of this proxy benchmark that was adjusted for ORNL’s Frontier system. This adjusted version also reduced the overhead of the instrumentation-based tools. Finally, when used as advertised by their developers, all tools work with acceptable runtime overhead, regardless of the profiling technique used.
Keywords: HPC; Profiling and tracing tools; Performance analysis; Parallel performance
Date: 2026
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-032-16397-4_1
Ordering information: This item can be ordered from
http://www.springer.com/9783032163974
DOI: 10.1007/978-3-032-16397-4_1