Automatic Differentiation (AD) has become a dominant technique in ML. AD frameworks were first implemented for imperative languages using tapes. Meanwhile, functional implementations of AD have been developed, often based on dual numbers, which are close to the formal specification of differentiation and hence easier to prove correct. However, this line of work has focused on correctness rather than efficiency. Recently, it was shown that an approach using dual numbers can be made efficient through the right optimizations. The effect of optimizations depends heavily on the order in which they are applied, since one optimization can enable another. It can therefore be useful to have fine-grained control over the scheduling of optimizations. One method expresses compiler optimizations as rewrite rules, whose application can be combined and controlled using strategy languages. Previous work describes the use of term rewriting and strategies to generate high-performance code in a compiler for a functional language. In this work, we implement dual-numbers AD in a functional array programming language and use rewrite rules and strategy combinators to optimize the result. We aim to combine the elegance of differentiation using dual numbers with a succinct expression of the optimization schedule using a strategy language. We give preliminary evidence suggesting the viability of the approach on a micro-benchmark.
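
To make the two ingredients concrete, the following is a minimal sketch, not the implementation described in this work. It assumes a toy term language of nested tuples with only constants, variables, addition and multiplication; the names `dual`, `mul_zero`, `mul_one`, `add_zero`, `lchoice`, `repeat`, `bottom_up` and `simplify` are hypothetical and chosen for illustration. The sketch pairs each term with its tangent (the dual-numbers view of differentiation), then cleans up the generated tangent with rewrite rules whose application is scheduled by strategy combinators.

```python
# Toy terms (an assumption of this sketch, not the paper's language):
# ("const", c), ("var", name), ("add", a, b), ("mul", a, b)
def const(c): return ("const", c)
def var(n): return ("var", n)
def add(a, b): return ("add", a, b)
def mul(a, b): return ("mul", a, b)

# --- Dual-numbers AD: every term is mapped to a (primal, tangent) pair ---
def dual(term, wrt):
    tag = term[0]
    if tag == "const":
        return term, const(0)
    if tag == "var":
        return term, const(1 if term[1] == wrt else 0)
    (a, da), (b, db) = dual(term[1], wrt), dual(term[2], wrt)
    if tag == "add":
        return add(a, b), add(da, db)
    if tag == "mul":  # product rule
        return mul(a, b), add(mul(a, db), mul(da, b))
    raise ValueError(f"unknown term: {tag}")

# --- Rewrite rules: return the rewritten term, or None if not applicable ---
def mul_zero(t):
    if t[0] == "mul" and const(0) in (t[1], t[2]):
        return const(0)
    return None

def mul_one(t):
    if t[0] == "mul":
        if t[1] == const(1): return t[2]
        if t[2] == const(1): return t[1]
    return None

def add_zero(t):
    if t[0] == "add":
        if t[1] == const(0): return t[2]
        if t[2] == const(0): return t[1]
    return None

# --- Strategy combinators: build a schedule out of rules ---
def lchoice(*strategies):
    """Apply the first strategy that succeeds; fail if none does."""
    def apply(t):
        for s in strategies:
            r = s(t)
            if r is not None:
                return r
        return None
    return apply

def repeat(s):
    """Apply s until it fails; always succeeds with the last term."""
    def apply(t):
        while (r := s(t)) is not None:
            t = r
        return t
    return apply

def bottom_up(s):
    """Rewrite the children first, then the node itself."""
    def apply(t):
        if t[0] in ("add", "mul"):
            t = (t[0], apply(t[1]), apply(t[2]))
        return s(t)
    return apply

# One possible schedule: exhaustively simplify each node, leaves first.
simplify = bottom_up(repeat(lchoice(mul_zero, mul_one, add_zero)))

def evaluate(t, env):
    """Evaluate a term given variable bindings in env."""
    tag = t[0]
    if tag == "const": return t[1]
    if tag == "var": return env[t[1]]
    a, b = evaluate(t[1], env), evaluate(t[2], env)
    return a + b if tag == "add" else a * b

x = var("x")
f = add(mul(x, x), mul(const(3), x))  # f(x) = x*x + 3*x
_, df = dual(f, "x")
print(df)            # unoptimized tangent, full of *1, *0 and +0
print(simplify(df))  # tangent after applying the schedule
```

For this `f`, the raw tangent contains four multiplications by a constant 0 or 1, and the schedule reduces it to the term for `(x + x) + 3`. The order matters even in this toy setting: `mul_zero` must turn `0 * x` into `0` before `add_zero` can remove the resulting `+ 0`, which is why the rules are applied bottom-up rather than only at the root.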