Automatic differentiation is everywhere, yet documentation of how it works in complex arithmetic rarely goes beyond stating that "derivatives in $\mathbb{C}^d$" $\cong$ "derivatives in $\mathbb{R}^{2d}$" and, at best, making shallow references to Wirtinger calculus. Unfortunately, the equivalence $\mathbb{C}^d \cong \mathbb{R}^{2d}$ becomes insufficient as soon as we need to derive custom gradient rules, e.g., to avoid differentiating "through" expensive linear algebra functions or differential equation simulators. To address this lack of documentation, this article surveys forward- and reverse-mode automatic differentiation with complex numbers, covering topics such as Wirtinger derivatives, a modified chain rule, and different gradient conventions, while explicitly avoiding holomorphicity and the Cauchy--Riemann equations (which would be far too restrictive). More precisely, we derive, explain, and implement complex versions of Jacobian-vector and vector-Jacobian products almost entirely with linear algebra, without relying on complex analysis or differential geometry. This tutorial is a call to action, for users and developers alike, to take complex values seriously when implementing custom gradient propagation rules; the manuscript explains how.
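As a small illustration of why a single complex derivative is not enough for non-holomorphic functions, the following sketch estimates both Wirtinger derivatives of $f(z) = z\bar{z}$ by central finite differences and combines them into a directional derivative (a scalar Jacobian-vector product). It is not taken from the article: the helper names `wirtinger` and `jvp`, the step size `h`, and the test function are assumptions made purely for illustration, and only the Python standard library is used.

```python
# Illustrative sketch only: helper names and step size are assumptions,
# not the article's implementation.

def wirtinger(f, z, h=1e-6):
    """Estimate (df/dz, df/dzbar) at z from real partial derivatives."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)            # partial along the real axis
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # partial along the imaginary axis
    d_z = 0.5 * (df_dx - 1j * df_dy)      # Wirtinger derivative w.r.t. z
    d_zbar = 0.5 * (df_dx + 1j * df_dy)   # Wirtinger derivative w.r.t. conj(z)
    return d_z, d_zbar


def jvp(f, z, v, h=1e-6):
    """Directional derivative of f at z along v, using both Wirtinger derivatives."""
    d_z, d_zbar = wirtinger(f, z, h)
    return d_z * v + d_zbar * v.conjugate()


def f(z):
    # f(z) = z * conj(z) = |z|^2 depends on conj(z), so it is not holomorphic.
    return z * z.conjugate()


z, v = 3 + 4j, 1 + 2j
d_z, d_zbar = wirtinger(f, z)   # analytically: conj(z) and z
tangent = jvp(f, z, v)          # analytically: 2 * Re(conj(z) * v)

# Reference value: differentiate t -> f(z + t * v) directly at t = 0.
reference = (f(z + h_ * v) - f(z - h_ * v)) / (2 * h_) if (h_ := 1e-6) else None
```

For this choice of $f$, using only `d_z * v` would give a complex number that differs from `reference`; the term with `d_zbar` and the conjugated tangent is what makes the two agree.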