We propose a CPU-GPU heterogeneous computing method for repeatedly solving time-evolution partial differential equation problems with guaranteed accuracy, a short time-to-solution, and a low energy-to-solution. On a single GH200 node, the proposed method ran 86.4 times and 8.67 times faster than the conventional method run only on the CPU and only on the GPU, respectively. Furthermore, the energy-to-solution was reduced 32.2-fold (from 9944 J to 309 J) and 7.01-fold (from 2163 J to 309 J) compared to using only the CPU and only the GPU, respectively. On the Alps supercomputer, the proposed method attained speedups of 51.6-fold and 6.98-fold over using only the CPU and only the GPU, respectively, and a high weak scaling efficiency of 94.3% was obtained up to 1,920 compute nodes. The implementations use directive-based parallel programming models, which keeps them portable and indicates that directives are highly effective for analyses in heterogeneous computing environments.
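The single-node figures can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the authors' implementation: the function names are hypothetical, and the implied power ratio assumes the reported speedups and energies come from the same runs, so that energy equals average power times time-to-solution.

```python
# Values reported in the abstract for a single GH200 node.
E_CPU_J = 9944.0   # energy-to-solution, CPU only (J)
E_GPU_J = 2163.0   # energy-to-solution, GPU only (J)
E_HET_J = 309.0    # energy-to-solution, proposed CPU-GPU method (J)
SPEEDUP_VS_CPU = 86.4
SPEEDUP_VS_GPU = 8.67


def fold_reduction(baseline: float, improved: float) -> float:
    """Return how many times smaller `improved` is than `baseline`."""
    return baseline / improved


def implied_power_ratio(speedup: float, energy_fold: float) -> float:
    """Ratio of average power (proposed / baseline).

    Assumption: E = P_avg * T for both runs, so
    P_new / P_old = (E_new / E_old) / (T_new / T_old) = speedup / energy_fold.
    """
    return speedup / energy_fold


energy_fold_cpu = fold_reduction(E_CPU_J, E_HET_J)
energy_fold_gpu = fold_reduction(E_GPU_J, E_HET_J)
power_ratio_cpu = implied_power_ratio(SPEEDUP_VS_CPU, energy_fold_cpu)
power_ratio_gpu = implied_power_ratio(SPEEDUP_VS_GPU, energy_fold_gpu)

# The rounded joule values reproduce the reported folds only approximately,
# because the abstract's ratios were presumably computed from unrounded data.
print(f"energy reduction vs CPU only: {energy_fold_cpu:.2f}-fold")
print(f"energy reduction vs GPU only: {energy_fold_gpu:.2f}-fold")
print(f"implied average power vs CPU only: {power_ratio_cpu:.2f}x")
print(f"implied average power vs GPU only: {power_ratio_gpu:.2f}x")
```

Under that assumption, the proposed method draws more average power than either baseline but finishes so much sooner that the total energy still drops.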