For many years, systems running Nvidia-based GPU architectures have dominated the heterogeneous supercomputer landscape. Recently, however, GPU chipsets manufactured by Intel and AMD have cut into this market and can now be found in some of the world's fastest supercomputers. The June 2023 edition of the TOP500 list of supercomputers ranks the Frontier supercomputer at the Oak Ridge National Laboratory in Tennessee as the top system in the world. This system features AMD Instinct MI250X GPUs and is currently the only true exascale computer in the world. The first framework that enabled support for heterogeneous platforms across multiple hardware vendors was OpenCL, in 2009. Since then, a number of frameworks have been developed to support vendor-agnostic heterogeneous environments, including OpenMP, OpenCL, Kokkos, and SYCL. SYCL, which combines the concepts of OpenCL with the flexibility of single-source C++, is one of the more promising programming models for heterogeneous computing devices. One key advantage of this framework is that it provides a higher-level programming interface that abstracts away more of the hardware details than the other frameworks do. This makes SYCL easier to learn and to maintain across multiple architectures and vendors. In recent years, there has been growing interest in using heterogeneous computing architectures to accelerate molecular dynamics simulations. Some of the more popular molecular dynamics simulation packages include Amber, NAMD, and GROMACS. However, to the best of our knowledge, only GROMACS has been successfully ported to SYCL to date. In this paper, we compare the performance of GROMACS compiled using the SYCL and CUDA frameworks for a variety of standard GROMACS benchmarks. In addition, we compare its performance across three different Nvidia GPU chipsets: the P100, V100, and A100.