Automatic source-to-source parallelization of serial code for shared and distributed memory systems is a challenging task in high-performance computing. Although many attempts have been made to translate serial code into parallel code for a shared memory environment (usually using OpenMP), none has managed to do so for a distributed memory environment. In this paper, we propose a novel approach, called MPI-rical, for automated MPI code generation using a transformer-based model trained on approximately 25,000 serial code snippets and their corresponding parallelized MPI code, drawn from the more than 50,000 code snippets in our corpus (MPICodeCorpus). To evaluate the performance of the model, we first break the problem of translating serial code into MPI-based parallel code into two sub-problems and define two research objectives: code completion, defined as predicting the MPI function for a given location in the source code, and code translation, defined as predicting an MPI function as well as its location in the source code. We evaluate MPI-rical on the MPICodeCorpus dataset and on real-world scientific code benchmarks, and compare its performance on the code completion and code translation tasks. Our experimental results show that while MPI-rical performs better on the code completion task than on the code translation task, the latter is better suited for real-world programming assistance, in which the tool suggests the need for an MPI function regardless of prior knowledge. Overall, our approach represents a significant step forward in automating the parallelization of serial code for distributed memory systems, which can save valuable time and resources for software developers and researchers. The source code used in this work, as well as other relevant sources, is available at: https://github.com/Scientific-Computing-Lab-NRCN/MPI-rical
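To make the difference between the two objectives concrete, the following is a minimal illustrative sketch and not the MPI-rical implementation. It assumes that MPI calls can be located with a simple regular expression and that a serial-like input can be approximated by dropping the lines that contain them. The sample program and the names `MPI_SOURCE`, `extract_mpi_calls`, `completion_examples`, and `translation_example` are hypothetical.

```python
import re

# Hypothetical sample: a tiny MPI program in C, held as a string.
MPI_SOURCE = """\
#include <mpi.h>
int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double local = rank * 2.0;
    MPI_Finalize();
    return 0;
}
"""

# An MPI function call is an MPI_ identifier followed by "(".
# Constants such as MPI_COMM_WORLD are not followed by "(" and are skipped.
MPI_CALL = re.compile(r"\b(MPI_[A-Za-z_]+)\s*\(")


def extract_mpi_calls(source):
    """Return (line_number, function_name) pairs, with 1-indexed lines."""
    calls = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in MPI_CALL.finditer(line):
            calls.append((number, match.group(1)))
    return calls


def completion_examples(source):
    """Code completion: the location is given, the function is the target."""
    return [
        {"location": line, "target": name}
        for line, name in extract_mpi_calls(source)
    ]


def translation_example(source):
    """Code translation: both the function and its location are targets.

    Assumption: the serial-like input is the program with every line that
    contains an MPI call removed. Locations refer to the MPI version.
    """
    targets = extract_mpi_calls(source)
    call_lines = {line for line, _ in targets}
    serial_like = [
        text
        for number, text in enumerate(source.splitlines(), start=1)
        if number not in call_lines
    ]
    return {"input": "\n".join(serial_like), "targets": targets}


if __name__ == "__main__":
    print(completion_examples(MPI_SOURCE))
    print(translation_example(MPI_SOURCE)["targets"])
```

In this framing, a completion model is asked one question per known location, whereas a translation model receives only the serial-like input and must recover the full list of (location, function) pairs, which is the harder setting.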