The HOPSA Workflow and Tools
Bernd Mohr,
Vladimir Voevodin,
Judit Giménez,
Erik Hagersten,
Andreas Knüpfer,
Dmitry A. Nikitenko,
Mats Nilsson,
Harald Servat,
Aamer Shah,
Frank Winkler,
Felix Wolf and
Ilya Zhukov
Additional contact information
Bernd Mohr: Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre
Vladimir Voevodin: Moscow State University, RCC
Judit Giménez: Barcelona Supercomputing Centre
Erik Hagersten: Rogue Wave Software AB
Andreas Knüpfer: Technical University Dresden
Dmitry A. Nikitenko: Moscow State University, RCC
Mats Nilsson: Rogue Wave Software AB
Harald Servat: Barcelona Supercomputing Centre
Aamer Shah: German Research School for Simulation Sciences GmbH / RWTH Aachen University
Frank Winkler: Technical University Dresden
Felix Wolf: German Research School for Simulation Sciences GmbH / RWTH Aachen University
Ilya Zhukov: Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre
A chapter in Tools for High Performance Computing 2012, 2013, pp 127-146 from Springer
Abstract:
To maximise the scientific output of a high-performance computing system, different stakeholders pursue different strategies. While individual application developers are trying to shorten the time to solution by optimising their codes, system administrators are tuning the configuration of the overall system to increase its throughput. Yet, the complexity of today’s machines with their strong interrelationship between application and system performance presents serious challenges to achieving these goals. The HOPSA project (HOlistic Performance System Analysis) therefore sets out to create an integrated diagnostic infrastructure for combined application and system-level tuning – with the former provided by the EU and the latter by the Russian project partners. Starting from system-wide basic performance screening of individual jobs, an automated workflow routes findings on potential bottlenecks either to application developers or system administrators with recommendations on how to identify their root cause using more powerful diagnostic tools. Developers can choose from a variety of mature performance-analysis tools developed by our consortium. Within this project, the tools will be further integrated and enhanced with respect to scalability, depth of analysis, and support for asynchronous tasking, a node-level paradigm playing an increasingly important role in hybrid programs on emerging hierarchical and heterogeneous systems.
Keywords: Performance Data; Performance Behaviour; System Administrator; Trace Format; Event Trace
Date: 2013
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-642-37349-7_9
Ordering information: This item can be ordered from
http://www.springer.com/9783642373497
DOI: 10.1007/978-3-642-37349-7_9