Solving differential equations with neural networks is difficult in practice because the runtime of automatic differentiation grows exponentially when computing high-order derivatives. We propose $n$-TangentProp, the natural extension of the TangentProp formalism \cite{simard1991tangent} to arbitrarily many derivatives. $n$-TangentProp computes the exact derivative $d^n/dx^n f(x)$ in quasilinear rather than exponential time for a densely connected, feed-forward neural network $f$ with a smooth, parameter-free activation function. We validate our algorithm empirically across a range of depths, widths, and derivative orders. We demonstrate that our method is particularly beneficial for physics-informed neural networks, where $n$-TangentProp enables significantly faster training than previous methods and scales favorably with both model size and loss-function complexity, as measured by the number of required derivatives. The code for this paper can be found at https://github.com/kyrochi/n\_tangentprop.