MPI Progress For All

How communication progresses in the Message Passing Interface (MPI) is not well defined, yet it is critical for application performance, particularly for achieving effective overlap of computation and communication. The opaque nature of MPI progress poses significant challenges to advancing MPI within modern high-performance computing (HPC) practices. First, the lack of clarity hinders the development of explicit guidelines for improving computation and communication overlap in applications. Second, it prevents MPI from integrating seamlessly with contemporary programming paradigms, such as task-based runtimes and event-driven programming. Third, it limits the extension of MPI functionality from user space. In this paper, we examine the role of MPI progress by analyzing the implementation details of MPI messaging. We then generalize the asynchronous communication pattern and identify key factors influencing application performance. Based on this analysis, we propose a set of MPI extensions designed to enable users to explicitly construct and manage an efficient progress engine. We provide example code to demonstrate the use of the proposed APIs in achieving improved performance, adapting MPI to task-based or event-driven programming styles, and constructing collective algorithms that rival the performance of native implementations. We compare our approach with previous efforts in the field, highlighting its reduced complexity and increased effectiveness.
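To make the idea of an explicitly driven progress engine concrete, the following is a toy sketch in plain Python. It is not the API proposed in the paper and it does not call MPI at all: `ToyRequest`, `ToyProgressEngine`, and `compute_with_overlap` are hypothetical names invented for illustration, and each "communication" is simply assumed to need a fixed number of polls to complete. The sketch only illustrates the general asynchronous pattern the abstract refers to: post nonblocking operations, interleave progress polls with computation, and react to completions through callbacks.

```python
from collections import deque


class ToyRequest:
    """Hypothetical stand-in for a nonblocking operation.

    Assumption: the operation completes after `steps` progress polls.
    """

    def __init__(self, name, steps, on_complete=None):
        self.name = name
        self.remaining = steps
        self.on_complete = on_complete
        self.done = False


class ToyProgressEngine:
    """Hypothetical user-level progress engine that the caller polls explicitly."""

    def __init__(self):
        self.pending = deque()

    def post(self, req):
        # Register a nonblocking operation; nothing advances until progress() runs.
        self.pending.append(req)
        return req

    def progress(self):
        # Advance every pending request by one step and fire completion callbacks.
        completed = 0
        for _ in range(len(self.pending)):
            req = self.pending.popleft()
            req.remaining -= 1
            if req.remaining <= 0:
                req.done = True
                completed += 1
                if req.on_complete is not None:
                    req.on_complete(req)
            else:
                self.pending.append(req)
        return completed


def compute_with_overlap(engine, work_chunks):
    # Interleave chunks of computation with explicit progress polls.
    total = 0
    polls = 0
    for chunk in work_chunks:
        total += sum(chunk)   # the "computation"
        engine.progress()     # give pending communication a chance to advance
        polls += 1
    # Drain whatever is still outstanding once the computation is finished.
    while engine.pending:
        engine.progress()
        polls += 1
    return total, polls
```

In this model, splitting the computation into more chunks gives pending operations more chances to advance before the final drain loop, which is the intuition behind overlapping computation with communication. A real MPI program would replace the toy requests with nonblocking calls and the poll with a test or progress call.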
