Riemannian submanifold optimization with momentum is computationally challenging because keeping the iterates on the submanifold often requires solving difficult differential equations. Here, we reduce these difficulties for a class of sparse or structured symmetric positive-definite matrices equipped with the affine-invariant metric. We do so by proposing a generalized version of Riemannian normal coordinates that dynamically orthonormalizes the metric and locally converts the problem into an unconstrained problem in Euclidean space. We use our approach to simplify existing methods for structured covariances and to develop matrix-inverse-free second-order optimizers for low-precision deep learning that use only matrix multiplications. Code: https://github.com/yorkerlin/StructuredNGD-DL
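To make the idea of locally unconstrained coordinates concrete, the following is a minimal NumPy sketch under stated assumptions; it is not the paper's algorithm. It assumes a toy objective $f(S) = \mathrm{tr}(CS) - \log\det S$ over dense symmetric positive-definite $S$ (whose minimizer is $S = C^{-1}$), a factorization $S = AA^\top$, and local coordinates $S(M) = A\,\mathrm{Expm}(M)\,A^\top$ with symmetric $M$. At $M = 0$ the affine-invariant metric reduces to the Frobenius inner product in these coordinates, so an ordinary Euclidean momentum step in $M$ is used, and the gradient $A^\top (C - S^{-1}) A$ simplifies to $A^\top C A - I$, which needs no inverse. The function name, the step size `lr`, the momentum weight `beta`, the second-order truncation of the matrix exponential, and the choice to keep the momentum buffer in local coordinates without any transport correction are all illustrative assumptions.

```python
import numpy as np


def inverse_free_spd_descent(C, lr=0.3, beta=0.5, steps=200):
    """Hypothetical sketch: minimize f(S) = tr(C S) - logdet(S) over SPD S = A A^T.

    Uses only matrix multiplications. Assumes the eigenvalues of C are
    moderately scaled (roughly order one); rescale C or lower lr otherwise.
    """
    d = C.shape[0]
    I = np.eye(d)
    A = np.eye(d)            # factor of the current iterate, S = A A^T
    m = np.zeros((d, d))     # momentum buffer, kept in local coordinates (assumption)
    for _ in range(steps):
        # Euclidean gradient of f(A Expm(M) A^T) with respect to M at M = 0:
        # A^T (C - S^{-1}) A = A^T C A - I, so no matrix inverse is required.
        g = A.T @ C @ A - I
        m = beta * m + g
        M = -lr * m
        # Second-order truncation of Expm(M / 2); then A A^T ~ A_old Expm(M) A_old^T.
        A = A @ (I + 0.5 * M + 0.125 * (M @ M))
    return A @ A.T


if __name__ == "__main__":
    C = np.array([[2.0, 0.5], [0.5, 1.0]])
    S = inverse_free_spd_descent(C)
    # S should approximate the inverse of C, so S @ C should be close to the identity.
    print(np.max(np.abs(S @ C - np.eye(2))))
```

In this sketch the truncated factor $I + M/2 + M^2/8$ is always invertible for symmetric $M$, because $1 + x/2 + x^2/8 > 0$ for every real $x$, so the iterate $AA^\top$ stays positive definite without any explicit projection or retraction.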