The C++ programming language provides classes and structs as fundamental modeling entities. Consequently, C++ code tends to favour array-of-structs (AoS) for encoding data sequences, even though structure-of-arrays (SoA) yields better performance for some calculations. We propose a C++ language extension based on attributes that allows developers to guide the compiler in selecting memory arrangements, i.e., to select the better of AoS and SoA dynamically, depending on both the execution context and the algorithm step. The compiler can then automatically convert data into the preferred format prior to the calculations and convert results back afterward. The compiler handles all the complexity of determining which data to convert and how to manage the data transformations. Our implementation realises the compiler extension for the new annotations in Clang, and we demonstrate their effectiveness through a smoothed particle hydrodynamics (SPH) code, which we evaluate on an Intel CPU, an ARM CPU, and a Grace-Hopper GPU. While the separation of concerns between data structure and operators is elegant and provides performance improvements, the new annotations do not eliminate the need for performance engineering. Instead, they challenge conventional performance wisdom and necessitate rethinking how to write efficient implementations.
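To make the convert, compute, convert-back pattern concrete, the following is a minimal hand-written sketch of the kind of transformation that the annotations are meant to delegate to the compiler. It does not use the proposed attribute syntax, which this abstract does not spell out. The `Particle` record, its fields, and the helpers `gather`, `drift`, `scatter`, and `drift_via_soa` are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle record (illustrative field names, not from the paper).
struct Particle {
    double x;     // position
    double v;     // velocity
    double mass;  // not used by the kernel below, so it is never converted
};

// Temporary SoA view that holds only the fields the kernel touches.
struct ParticleSoA {
    std::vector<double> x;
    std::vector<double> v;
};

// Gather: copy the required fields from the AoS layout into contiguous arrays.
ParticleSoA gather(const std::vector<Particle>& aos) {
    ParticleSoA soa;
    soa.x.reserve(aos.size());
    soa.v.reserve(aos.size());
    for (const Particle& p : aos) {
        soa.x.push_back(p.x);
        soa.v.push_back(p.v);
    }
    return soa;
}

// Kernel on the SoA layout: explicit Euler drift, x += dt * v.
// Unit-stride access over plain arrays is what makes this loop easy to vectorise.
void drift(ParticleSoA& soa, double dt) {
    assert(soa.x.size() == soa.v.size());
    const std::size_t n = soa.x.size();
    for (std::size_t i = 0; i < n; ++i) {
        soa.x[i] += dt * soa.v[i];
    }
}

// Scatter: write back only the field that the kernel modified.
void scatter(const ParticleSoA& soa, std::vector<Particle>& aos) {
    assert(soa.x.size() == aos.size());
    for (std::size_t i = 0; i < aos.size(); ++i) {
        aos[i].x = soa.x[i];
    }
}

// The full pattern: AoS -> SoA, compute, SoA -> AoS.
void drift_via_soa(std::vector<Particle>& particles, double dt) {
    ParticleSoA soa = gather(particles);
    drift(soa, dt);
    scatter(soa, particles);
}
```

Calling `drift_via_soa(particles, dt)` leaves the caller's data in AoS form while the loop itself runs over contiguous arrays. The sketch also shows why such conversions are not free: the gather and scatter passes add memory traffic, so whether the temporary SoA view pays off depends on how much work the kernel performs per element.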