One-sided communication is a useful paradigm for irregular parallel applications, but most one-sided programming environments, including MPI's one-sided interface and PGAS programming languages, lack application-level libraries to support these applications. We present the Berkeley Container Library (BCL), a set of generic, cross-platform, high-performance data structures for irregular applications, including queues, hash tables, Bloom filters, and more. BCL is written in C++ using an internal DSL called the BCL Core, which provides one-sided communication primitives such as remote get and remote put operations. The BCL Core has backends for MPI, OpenSHMEM, GASNet-EX, and UPC++, allowing BCL data structures to be used natively in programs written in any of these programming environments. Along with our internal DSL, we present the BCL ObjectContainer abstraction, which allows BCL data structures to transparently serialize complex data types while maintaining efficiency for primitive types. We also introduce the set of BCL data structures and evaluate their performance across a number of high-performance computing systems, demonstrating that BCL programs are competitive with hand-optimized code while hiding many of the underlying details of message aggregation, serialization, and synchronization.
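To make the ObjectContainer idea concrete, the following is a minimal single-process sketch of how a container might store primitive types as-is while serializing a complex type into a fixed-size, byte-copyable form. It is not the BCL API: the names `Container`, `round_trip`, and `kCapacity`, the 32-byte buffer, and the truncation of long strings are all assumptions made for illustration, and a plain `memcpy` into a local buffer stands in for the remote put and remote get operations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical names throughout; this is an illustration, not the BCL API.

// Assumed fixed capacity for the serialized form of a string.
constexpr std::size_t kCapacity = 32;

// Primary template: trivially copyable types are stored unchanged, so
// "serialization" costs nothing beyond a plain copy.
template <typename T>
struct Container {
  static_assert(std::is_trivially_copyable<T>::value,
                "provide a specialization for this type");
  T value;
  void pack(const T& v) { value = v; }
  T unpack() const { return value; }
};

// Specialization for std::string: the characters are copied into a
// fixed-size buffer so the container itself can be moved byte by byte.
template <>
struct Container<std::string> {
  std::size_t length = 0;
  std::array<char, kCapacity> bytes{};
  void pack(const std::string& s) {
    // Assumption of this sketch: strings longer than kCapacity are truncated.
    length = s.size() < kCapacity ? s.size() : kCapacity;
    std::memcpy(bytes.data(), s.data(), length);
  }
  std::string unpack() const { return std::string(bytes.data(), length); }
};

// Packs a value, copies the packed bytes into a buffer standing in for a
// remote memory segment, copies them back, and unpacks the result.
template <typename T>
T round_trip(const T& v) {
  Container<T> local;
  local.pack(v);

  unsigned char remote[sizeof(Container<T>)];       // stand-in for remote memory
  std::memcpy(remote, &local, sizeof(local));       // plays the role of a remote put

  Container<T> fetched;
  std::memcpy(&fetched, remote, sizeof(fetched));   // plays the role of a remote get
  return fetched.unpack();
}
```

In this sketch, a call such as `round_trip(42)` goes through the primary template with no extra work, while `round_trip(std::string("bloom"))` goes through the specialization. Selecting the representation at compile time is one way a generic data structure could keep primitive types cheap while still accepting complex ones.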