Two-level parallel load balancing strategy for accelerating DSMC simulations in near-continuum gases
Chenxiang Xiao,
Chenchen Zhang,
Bin Zhang,
Hui Xu and
Hong Liu
Additional contact information
Chenxiang Xiao: School of Aeronautics and Astronautics, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai 200240, P. R. China
Chenchen Zhang: School of Mathematical Sciences, Peking University, Beijing 100871, P. R. China
Bin Zhang: School of Aeronautics and Astronautics, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai 200240, P. R. China; Sichuan Research Institute, Shanghai Jiao Tong University, Chengdu 610213, P. R. China
Hui Xu: School of Aeronautics and Astronautics, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai 200240, P. R. China
Hong Liu: School of Aeronautics and Astronautics, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai 200240, P. R. China
International Journal of Modern Physics C (IJMPC), 2025, vol. 36, issue 03, 1-17
Abstract:
The Direct Simulation Monte Carlo (DSMC) algorithm is widely employed for simulating rarefied gas flows and is increasingly applied in near-continuum regimes for research and engineering purposes. However, its computational demands, notably load imbalance and long simulation times, hinder widespread adoption. To address these challenges, this paper introduces a two-level parallel load balancing strategy that combines thread-level and multi-process parallelism to improve load balance and reduce simulation time. Key features include a thread-level load-decoupling strategy implemented with OpenMP and a multi-process load balancing mechanism on distributed memory implemented with MPI. Building upon our previous PartPlusColl approach [L. Li, W. Ren and B. Zhang, J. Aeronaut. Astronaut. Aviat. Ser. A 46, 88 (2014)], the load balancing mechanism uses Stop At Risk (SAR) criteria for repartitioning with METIS. Additionally, a specialized data transmission mechanism based on MPI nonblocking communication minimizes global communication between processes. The strategy is validated and evaluated on four hypersonic flow cases around a cylinder and a sphere, which show significant improvements. Notably, the proposed strategy achieves a 30% enhancement over the PartPlusColl strategy under 512 CPU cores compared to 16 CPU cores, and reduces inter-process communication time by 33.57%. These advances improve the effectiveness of the DSMC algorithm in near-continuum aerodynamic simulations.
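The abstract does not spell out how the SAR criteria decide when to repartition. The sketch below assumes the form of the SAR heuristic commonly cited in the dynamic load-balancing literature: track the degradation function W(n) = (sum of per-step idle time since the last repartition + repartition cost) / n, and repartition at the first step where W(n) rises. The function name, its arguments, and the sample timings are hypothetical and are not taken from the paper; the paper's actual formulation may differ.

```python
from typing import List, Optional


def sar_repartition_step(
    max_times: List[float],
    avg_times: List[float],
    repartition_cost: float,
) -> Optional[int]:
    """Return the 1-based step at which W(n) first rises, or None.

    max_times[i]: wall time of the slowest process at step i (assumed input).
    avg_times[i]: mean wall time over all processes at step i (assumed input).
    repartition_cost: estimated cost of one repartition (assumed input).
    """
    cumulative_idle = 0.0
    previous_w: Optional[float] = None
    for n, (t_max, t_avg) in enumerate(zip(max_times, avg_times), start=1):
        # Idle time at this step: how long the average process waits
        # for the slowest one.
        cumulative_idle += t_max - t_avg
        # Degradation function: idle time plus repartition cost,
        # amortized over the steps since the last repartition.
        w = (cumulative_idle + repartition_cost) / n
        if previous_w is not None and w > previous_w:
            return n  # W has started to rise: repartition now
        previous_w = w
    return None


if __name__ == "__main__":
    # Hypothetical timings: imbalance grows as particles migrate.
    slowest = [1.0, 1.1, 1.3, 1.6, 2.0, 2.5]
    mean = [1.0] * 6
    print(sar_repartition_step(slowest, mean, repartition_cost=1.0))
```

In a hybrid MPI/OpenMP code, the per-step maximum and mean would typically come from a reduction over process timers, and a returned step would be the point at which the mesh is handed to the partitioner; both of those parts are outside this sketch.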
Keywords: DSMC; MPI/OpenMP; load balance; nonblock communication
Date: 2025
Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0129183124501985
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijmpcx:v:36:y:2025:i:03:n:s0129183124501985
DOI: 10.1142/S0129183124501985
International Journal of Modern Physics C (IJMPC) is currently edited by H. J. Herrmann
Bibliographic data for series maintained by Tai Tone Lim.