Monte Carlo (MC) simulations play a pivotal role in diverse scientific and engineering domains, with applications ranging from nuclear physics to materials science. Harnessing the computational power of high-performance computing (HPC) systems, especially Graphics Processing Units (GPUs), has become essential for accelerating MC simulations. This paper focuses on the adaptation and optimization of OpenMC, a Monte Carlo code for neutron and photon transport, for Intel GPUs, specifically the Intel Data Center GPU Max 1100 (codename Ponte Vecchio, PVC), through distributed OpenMP offloading. Building upon prior work by Tramm et al. (2022), which laid the groundwork for GPU adaptation, our study extends the OpenMC code's capabilities to Intel GPUs. We present a comprehensive benchmarking and scaling analysis, comparing performance on Intel Max GPUs to state-of-the-art CPU execution (Intel Xeon Platinum 8480+ processor, 4th generation Xeon, codename Sapphire Rapids). The results demonstrate a substantial speedup over CPU execution, with the GPU-adapted code outperforming its CPU counterpart as the computational load increases.