We describe an exact algorithm for solving linear systems of the form $Hx=b$, where $H$ is the Hessian of a deep net. The method computes Hessian-inverse-vector products without storing the Hessian or its inverse, and it requires time and storage that scale linearly in the number of layers. By contrast, the naive approach of first computing the Hessian and then solving the linear system takes storage that is quadratic and time that is cubic in the number of layers. The Hessian-inverse-vector product method scales roughly like Pearlmutter's algorithm for computing Hessian-vector products.
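To make the two reference points in this comparison concrete, here is a minimal JAX sketch of (a) a Pearlmutter-style Hessian-vector product, which never forms $H$, and (b) the naive baseline that materializes $H$ and solves $Hx=b$ densely. It does not implement the Hessian-inverse-vector algorithm described here. The toy two-layer network, the layer sizes, the small ridge term, and the names `loss`, `hvp`, and `naive_solve` are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp

# Use float64 so the dense solve below is numerically well behaved.
jax.config.update("jax_enable_x64", True)

D_IN, D_HID = 3, 4  # hypothetical layer sizes for a toy network


def loss(theta, x, y):
    # Toy two-layer tanh network with a flat parameter vector `theta`.
    # The ridge term is an illustrative assumption, not part of any source method.
    W1 = theta[: D_IN * D_HID].reshape(D_HID, D_IN)
    w2 = theta[D_IN * D_HID:]
    pred = jnp.tanh(x @ W1.T) @ w2
    return 0.5 * jnp.mean((pred - y) ** 2) + 0.05 * jnp.sum(theta ** 2)


def hvp(theta, v, x, y):
    # Pearlmutter-style Hessian-vector product: the directional derivative
    # of the gradient along v, computed without building the Hessian.
    grad_fn = lambda t: jax.grad(loss)(t, x, y)
    return jax.jvp(grad_fn, (theta,), (v,))[1]


def naive_solve(theta, b, x, y):
    # Naive baseline: materialize the full Hessian, then solve H x = b.
    H = jax.hessian(loss)(theta, x, y)
    return jnp.linalg.solve(H, b)
```

A quick consistency check is to apply `hvp` to the output of `naive_solve` and confirm that it reproduces `b` up to numerical tolerance. The dense route needs the full parameter-by-parameter matrix, while `hvp` costs a small constant multiple of one gradient evaluation.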