Operator-overloading algorithmic differentiation (AD) is commonly applied to computer programs in order to compute their derivatives. However, replacing the underlying floating-point type with the specialized type of an AD tool has two drawbacks. First, the memory structure of the program changes: floating-point data is interleaved with AD identifiers, which prevents the compiler from performing optimizations such as SIMD vectorization. Second, the AD tool does not see the domain-specific operations that the program uses, e.g., linear algebra operations, so it cannot apply specialized algorithms in those places. We propose a new AD tool that is tailored to such situations. The memory structure of the primal data is retained by associating an identifier with each entity, e.g., a matrix, and not with each floating-point value, e.g., an element of the matrix. Operations on such entities can then be annotated, and a generator is used to create the AD overloads. We demonstrate that this approach provides performance comparable to that of other specializations. In addition, the run-time factor is below the theoretical factor of 4.5 for reverse AD for programs that are written purely with linear algebra entities and operations.
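The difference between per-value and per-entity identifiers can be illustrated with a minimal C++ sketch. It is not the proposed tool: the names `ActiveScalar`, `ActiveMatrix`, `Tape`, and `matmul` are hypothetical, the reverse sweep is hand-written rather than produced by a generator, and the tape simply copies the primal operands. The sketch assumes row-major storage and uses the standard adjoint rule for a matrix product $C = AB$, namely $\bar{A} \mathrel{+}= \bar{C} B^T$ and $\bar{B} \mathrel{+}= A^T \bar{C}$.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Per-value AD type: every double carries its own identifier, so an array of
// these interleaves floating-point data with AD identifiers in memory.
struct ActiveScalar {
    double value;
    int id;
};

// Per-entity AD type (hypothetical): one identifier for the whole matrix.
// The primal data stays a contiguous, row-major array of doubles.
struct ActiveMatrix {
    std::size_t rows, cols;
    std::vector<double> data;
    int id;
};

// Minimal tape: one adjoint buffer per entity plus a list of reverse callbacks.
struct Tape {
    std::vector<std::vector<double>> adjoints;
    std::vector<std::function<void()>> reverse_ops;

    // Registers a zero-initialized matrix and its adjoint buffer.
    ActiveMatrix create(std::size_t rows, std::size_t cols) {
        adjoints.emplace_back(rows * cols, 0.0);
        return ActiveMatrix{rows, cols, std::vector<double>(rows * cols, 0.0),
                            static_cast<int>(adjoints.size()) - 1};
    }

    // Runs the recorded operations in reverse order.
    void evaluate() {
        for (auto it = reverse_ops.rbegin(); it != reverse_ops.rend(); ++it) (*it)();
    }
};

// C = A * B, recorded as a single tape entry instead of one entry per
// scalar multiply-add.
ActiveMatrix matmul(Tape& tape, const ActiveMatrix& a, const ActiveMatrix& b) {
    assert(a.cols == b.rows);
    const std::size_t m = a.rows, n = a.cols, p = b.cols;
    ActiveMatrix c = tape.create(m, p);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < p; ++j)
                c.data[i * p + j] += a.data[i * n + k] * b.data[k * p + j];

    // Reverse step: A_bar += C_bar * B^T and B_bar += A^T * C_bar.
    // The primal operands are copied into the closure for simplicity.
    tape.reverse_ops.push_back([&tape, a_data = a.data, b_data = b.data,
                                ai = a.id, bi = b.id, ci = c.id, m, n, p]() {
        const std::vector<double>& c_bar = tape.adjoints[ci];
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t k = 0; k < n; ++k)
                for (std::size_t j = 0; j < p; ++j) {
                    tape.adjoints[ai][i * n + k] += c_bar[i * p + j] * b_data[k * p + j];
                    tape.adjoints[bi][k * p + j] += a_data[i * n + k] * c_bar[i * p + j];
                }
    });
    return c;
}
```

To use the sketch, create the operands with `Tape::create`, fill their `data`, call `matmul`, seed the adjoint of the result in `tape.adjoints[c.id]`, and call `tape.evaluate()`. Because the identifier lives on the matrix, the inner loops run over plain `double` arrays, and the triple loops could in principle be swapped for an optimized linear algebra routine without changing the tape layout.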