The evolution of the computing landscape has resulted in the proliferation of diverse hardware architectures, with different flavors of GPUs and other compute accelerators becoming more widely available. To facilitate the efficient use of these architectures in a heterogeneous computing environment, several programming models, such as Kokkos, SYCL, and OpenMP, are available to enable portability and performance across different computing systems. As part of the High Energy Physics Center for Computational Excellence (HEP-CCE) project, we investigate whether and how these programming models may be suitable for experimental HEP workflows through a few representative use cases. One such use case is the Liquid Argon Time Projection Chamber (LArTPC) simulation, which is essential for LArTPC detector design, validation, and data analysis. Following up on our previous investigations of using Kokkos to port the LArTPC simulation in the Wire-Cell Toolkit (WCT) to GPUs, we have explored OpenMP and SYCL as potential portable programming models for WCT, with the goal of making diverse computing resources accessible to LArTPC simulations. In this work, we describe how we use relevant features of OpenMP and SYCL in the LArTPC simulation module of WCT. We also show performance benchmark results on multi-core CPUs and on NVIDIA and AMD GPUs for both the OpenMP and SYCL implementations. Comparisons between different compilers are given where appropriate.