The progress of communication in the Message Passing Interface (MPI) is not well defined, yet it is critical to application performance, particularly for achieving effective computation-communication overlap. The opaque nature of MPI progress poses significant challenges to advancing MPI within modern high-performance computing (HPC) practice. First, the lack of clarity hinders the development of explicit guidelines for improving computation-communication overlap in applications. Second, it prevents MPI from integrating seamlessly with contemporary programming paradigms such as task-based runtimes and event-driven programming. Third, it limits the extension of MPI functionality from user space. In this paper, we examine the role of MPI progress by analyzing the implementation details of MPI messaging. We then generalize the asynchronous communication pattern and identify the key factors influencing application performance. Based on this analysis, we propose a set of MPI extensions that enable users to explicitly construct and manage an efficient progress engine. We provide example code demonstrating how the proposed APIs can be used to achieve improved performance, to adapt MPI to task-based or event-driven programming styles, and to construct collective algorithms whose performance rivals that of native implementations. We compare our approach with previous efforts in the field, highlighting its reduced complexity and increased effectiveness.