We present a linear algebra formulation of backpropagation which allows the calculation of gradients by using a generically written ``backslash'' or Gaussian elimination on triangular systems of equations. Generally, the matrix elements are operators. This paper has three contributions: (i) it is of intellectual value to replace traditional treatments of automatic differentiation with a (left acting) operator theoretic, graph-based approach; (ii) operators can be readily placed in matrices in software in programming languages such as Julia as an implementation option; (iii) we introduce a novel notation, ``transpose dot'' operator ``$\{\}^{T_\bullet}$'' that allows for the reversal of operators. We further demonstrate the elegance of the operators approach in a suitable programming language consisting of generic linear algebra operators such as Julia \cite{bezanson2017julia}, and that it is possible to realize this abstraction in code. Our implementation shows how generic linear algebra can allow operators as elements of matrices. In contrast to ``operator overloading,'' where backslash would normally have to be rewritten to take advantage of operators, with ``generic programming'' there is no such need.
翻译:我们提出了一种反向传播的线性代数形式,通过使用通用编写的“反斜杠”或高斯消元法求解三角方程组来计算梯度。通常,矩阵元素为算子。本文有三项贡献:(i) 用(左作用)算子论、基于图的方法替代自动微分的传统处理方式具有学术价值;(ii) 在Julia等编程语言中,算子可轻松嵌入矩阵作为实现选项;(iii) 我们引入了一种新颖的符号——“转置点”算子“$\{\}^{T_\bullet}$”,用于实现算子的反转。我们进一步在包含通用线性代数算子(如Julia \cite{bezanson2017julia})的合适编程语言中展示了算子方法的优雅性,并验证了这种抽象在代码中实现的可能性。我们的实现展示了通用线性代数如何使算子成为矩阵元素。与通常需要重写反斜杠以利用算子的“运算符重载”不同,在“泛型编程”中无需此类操作。