Regent is an implicitly parallel programming language that allows a single codebase to be developed for heterogeneous platforms comprising CPUs and GPUs. This paper presents the development of a parallel meshfree solver in Regent for two-dimensional inviscid compressible flows. The meshfree solver is based on the least squares kinetic upwind method. Example codes are presented to show the differences between the Regent and CUDA-C implementations of the meshfree solver on a GPU node. For parallel computations on CPUs, details are presented on how data communication and synchronisation are handled by the Regent and Fortran+MPI codes. The Regent solver is verified by applying it to standard test cases for inviscid flows. Benchmark simulations are performed on point distributions ranging from coarse to very fine to assess the solver's performance. The computational efficiency of the Regent solver on an A100 GPU is compared with that of an equivalent meshfree solver written in CUDA-C, and both codes are profiled to investigate the differences in their performance. The performance of the Regent solver on CPU cores is compared with that of an equivalent explicitly parallel Fortran meshfree solver based on MPI. Scalability results are presented to provide insight into the parallel performance.