Riemannian submanifold optimization with momentum is computationally challenging because keeping the iterates on the submanifold often requires solving difficult differential equations. Here, we reduce this difficulty for a class of sparse or structured symmetric positive-definite matrices equipped with the affine-invariant metric. We do so by proposing a generalized version of Riemannian normal coordinates that dynamically orthonormalizes the metric and locally converts the problem into an unconstrained problem in Euclidean space. We use our approach to simplify existing methods for structured covariances and to develop matrix-inverse-free second-order optimizers for low-precision deep learning that use only matrix multiplications. Code: https://github.com/yorkerlin/StructuredNGD-DL
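As a rough illustration of what a matrix-inverse-free update using only matrix multiplications can look like, the sketch below maintains a factor `K` and updates it multiplicatively so that `K @ K.T` approaches the inverse of a fixed symmetric positive-definite matrix `H`. It is not the optimizer proposed in the paper: the function name `inverse_free_step`, the step size `beta`, and the small test matrix `H` are all assumptions chosen for illustration.

```python
import numpy as np

def inverse_free_step(K, H, beta=0.1):
    """One multiplicative update of the factor K (illustrative, not the paper's rule).

    Uses only matrix products. For invertible K, the fixed point
    K.T @ H @ K = I is equivalent to K @ K.T = inv(H).
    """
    d = K.shape[0]
    identity = np.eye(d)
    M = K.T @ H @ K  # equals the identity exactly when K K^T = H^{-1}
    return K @ (identity - 0.5 * beta * (M - identity))

# Hypothetical 2x2 symmetric positive-definite "curvature" matrix.
H = np.array([[2.0, 0.5],
              [0.5, 1.0]])

K = np.eye(2)  # start from the identity factor
for _ in range(300):
    K = inverse_free_step(K, H)

# Residual of (K K^T) H = I; no explicit inverse is ever formed.
residual = np.linalg.norm(K @ K.T @ H - np.eye(2))
```

Because the update multiplies `K` on the right instead of adding to `K @ K.T` directly, the product `K @ K.T` stays symmetric positive-definite as long as `K` remains invertible, which for this toy problem holds when `beta` is small relative to the eigenvalues of `H`.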