Analysis of cache behaviour and software optimizations for faster on-chip network simulations
B. M. Prabhu Prasad (),
Khyamling Parane () and
Basavaraj Talawar ()
Additional contact information
B. M. Prabhu Prasad: National Institute of Technology Karnataka, Surathkal
Khyamling Parane: National Institute of Technology Karnataka, Surathkal
Basavaraj Talawar: National Institute of Technology Karnataka, Surathkal
International Journal of System Assurance Engineering and Management, 2019, vol. 10, issue 4, No 20, 696-712
Abstract:
Abstract Fast simulations are critical in reducing time to market in chip multiprocessors and system-on-chips. Several simulators have been used to evaluate the performance and power consumed by network-on-chips (NoCs). To speedup the simulations, it is necessary to investigate and optimize the hotspots in the simulator source code. Among several simulators available, Booksim2.0 has been chosen for the experimentation as it is being extensively used in the NoC community. In this paper, the cache and memory system behavior of Booksim2.0 have been analyzed to accurately monitor input dependent performance bottlenecks. The measurements show that cache and memory usage patterns vary widely based on the input parameters given to Booksim2.0. Based on these measurements, the cache configuration having the least misses has been identified. To further reduce the cache misses, software optimization techniques such as removal of unused functions, loop interchanging and replacing post-increment operator with pre-increment operator for non-primitive data types have been employed. The cache misses were reduced by 18.52%, 5.34% and 3.91% by employing above technology respectively. Thread parallelization and vectorization have been employed to improve the overall performance of Booksim2.0. The OpenMP programming model and SIMD are used for parallelizing and vectorizing the more time-consuming portions of Booksim2.0. Speedups of 2.93× and 3.97× were observed for the Mesh topology with $$30\times 30$$ 30 × 30 network size by employing thread parallelization and vectorization respectively.
Keywords: Cache behaviour; Hotspot analysis; Network-on-chip; Parallelization; Performance profiling (search for similar items in EconPapers)
Date: 2019
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s13198-019-00799-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:ijsaem:v:10:y:2019:i:4:d:10.1007_s13198-019-00799-5
Ordering information: This journal article can be ordered from
http://www.springer.com/engineering/journal/13198
DOI: 10.1007/s13198-019-00799-5
Access Statistics for this article
International Journal of System Assurance Engineering and Management is currently edited by P.K. Kapur, A.K. Verma and U. Kumar
More articles in International Journal of System Assurance Engineering and Management from Springer, The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().