The natural gradient method is a central tool for statistical optimisation, but its broader application is hindered by the assumption of a Euclidean parameter space, the repeated estimation of the Fisher information matrix (FIM), and the computational cost of its subsequent inversion. This paper proposes an intrinsic, inversion-free natural gradient method for statistical models whose parameters lie on general Riemannian manifolds. Formulating statistical optimisation in this non-Euclidean setting allows for the natural enforcement of parameter constraints, the elimination of non-identifiable parameters, and the exploitation of geodesic convexity. Our algorithm is based on a moving approximation of the inverse FIM, which is maintained directly on the manifold. This approximation is efficiently updated with new score vectors using low-rank matrix identities. We prove almost-sure convergence rates of $O(\log s / s^{\alpha})$ for the sequence of iterates, and a similar rate for the approximate FIM. A limited-memory variant with sub-quadratic storage complexity is further proposed for large-scale applications. We demonstrate the efficacy of our method on variational Bayes within the Bures-Wasserstein manifold, normalising flows on the Stiefel manifold, and reduced-rank logistic regression.
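To make the idea of an inversion-free update concrete, the following is a minimal sketch of one low-rank identity of the kind the abstract mentions, the Sherman-Morrison formula, in a flat (Euclidean) parameter space. It is not the paper's algorithm: the manifold operations (retraction, transport of the approximation between tangent spaces) are omitted entirely, and the running-average form of the FIM estimate, the step weight `rho`, and the function name `sherman_morrison_update` are assumptions made for illustration only.

```python
import numpy as np


def sherman_morrison_update(H, g, rho):
    """Hypothetical helper: update an inverse-FIM estimate with one score vector.

    Assumes the FIM estimate follows the running average
        F_new = (1 - rho) * F_old + rho * outer(g, g),
    and that H = inv(F_old) is symmetric. Returns inv(F_new) using the
    Sherman-Morrison identity, so no matrix inversion is performed.
    """
    Hg = H @ g                                   # O(d^2) matrix-vector product
    denom = (1.0 - rho) + rho * (g @ Hg)         # scalar, positive if H is positive definite
    return (H - (rho / denom) * np.outer(Hg, Hg)) / (1.0 - rho)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 5
    H = np.eye(d)          # initial inverse-FIM estimate (assumption: identity)
    theta = np.zeros(d)    # Euclidean parameters, for illustration only
    for s in range(1, 101):
        g = rng.standard_normal(d)   # stand-in for a score vector
        H = sherman_morrison_update(H, g, rho=1.0 / (s + 1))
        theta = theta - 0.1 * (H @ g)  # preconditioned (natural-gradient-like) step
```

Each update costs $O(d^2)$ operations rather than the $O(d^3)$ of a fresh inversion, which is the motivation for maintaining the inverse directly. The limited-memory variant described in the abstract would reduce storage further; that is not attempted in this sketch.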