Automatic differentiation (AD) has become a dominant technique in machine learning. AD frameworks were first implemented for imperative languages using tapes. Meanwhile, functional implementations of AD have been developed, often based on dual numbers, which are close to the formal specification of differentiation and hence easier to prove correct. However, this line of work has focused on correctness rather than efficiency. Recently, it was shown that a dual-numbers approach can be made efficient through the right optimizations. The effect of optimizations depends heavily on the order in which they are applied, as one optimization can enable another. It can therefore be useful to have fine-grained control over the scheduling of optimizations. One method expresses compiler optimizations as rewrite rules, whose application can be combined and controlled using strategy languages. Previous work describes the use of term rewriting and strategies to generate high-performance code in a compiler for a functional language. In this work, we implement dual-numbers AD in a functional array programming language, using rewrite rules and strategy combinators for optimization. We aim to combine the elegance of differentiation using dual numbers with a succinct expression of the optimization schedule using a strategy language. We give preliminary evidence suggesting the viability of the approach on a micro-benchmark.
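To make the combination of ideas concrete, the following is a minimal Python sketch and not the implementation described in this work. It assumes a toy term language of nested tuples, derives the tangent component of a term using the dual-number rules for addition and multiplication, and then cleans up the result with three rewrite rules scheduled by simple strategy combinators. All names (`tangent`, `mul_zero`, `mul_one`, `add_zero`, `choice`, `repeat`, `bottom_up`, `simplify`) are hypothetical and chosen for illustration only.

```python
# Toy terms are nested tuples (an assumption of this sketch):
#   ("const", c), ("var", name), ("add", a, b), ("mul", a, b)
ZERO, ONE = ("const", 0.0), ("const", 1.0)


def tangent(t, wrt):
    """Tangent (derivative) component of t with respect to variable wrt,
    following the dual-number rules for constants, variables, + and *."""
    tag = t[0]
    if tag == "const":
        return ZERO
    if tag == "var":
        return ONE if t[1] == wrt else ZERO
    a, b = t[1], t[2]
    if tag == "add":
        return ("add", tangent(a, wrt), tangent(b, wrt))
    # tag == "mul": product rule
    return ("add", ("mul", tangent(a, wrt), b), ("mul", a, tangent(b, wrt)))


# Rewrite rules: each returns the rewritten term, or None if it does not apply.
def mul_zero(t):
    if t[0] == "mul" and ZERO in (t[1], t[2]):
        return ZERO
    return None


def mul_one(t):
    if t[0] == "mul" and t[1] == ONE:
        return t[2]
    if t[0] == "mul" and t[2] == ONE:
        return t[1]
    return None


def add_zero(t):
    if t[0] == "add" and t[1] == ZERO:
        return t[2]
    if t[0] == "add" and t[2] == ZERO:
        return t[1]
    return None


# Strategy combinators: build larger strategies out of rules.
def choice(*strategies):
    """Apply the first strategy that succeeds; fail (None) if none does."""
    def run(t):
        for s in strategies:
            result = s(t)
            if result is not None:
                return result
        return None
    return run


def repeat(s):
    """Apply s until it fails; always succeeds with the last term reached."""
    def run(t):
        while True:
            result = s(t)
            if result is None:
                return t
            t = result
    return run


def bottom_up(s):
    """Rewrite the children first, then apply s (which must not fail) at the node."""
    def run(t):
        if t[0] in ("add", "mul"):
            t = (t[0], run(t[1]), run(t[2]))
        return s(t)
    return run


simplify = bottom_up(repeat(choice(mul_zero, mul_one, add_zero)))

raw = tangent(("mul", ("var", "x"), ("var", "y")), "x")
print(raw)
print(simplify(raw))
```

In this sketch, `add_zero` on its own leaves `raw` unchanged, because the zero it needs only appears after `mul_zero` has rewritten the right-hand product. This is a small instance of one optimization enabling another, and the reason the order of application is worth controlling explicitly.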