The many production implementations of the Message Passing Interface (MPI) each rely on different underlying algorithms. Each emerging supercomputer supports one or a small number of system MPI installations, tuned for the given architecture. Performance varies with MPI version, but application programmers are typically unable to achieve optimal performance with local MPI installations and therefore rely on whichever implementation is provided as a system install. This paper presents MPI Advance, a collection of libraries that sit on top of MPI and optimize the performance of any existing MPI library. The libraries provide optimizations for collectives, neighborhood collectives, partitioned communication, and GPU-aware communication.