The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions whose parameters lie on a Riemannian manifold. The manifold setting offers several advantages: one can implicitly enforce parameter constraints such as positive definiteness and orthogonality, ensure parameters are identifiable, or guarantee regularity properties of the objective like geodesic convexity. Building on an intrinsic formulation of the Fisher information matrix (FIM) on a manifold, our method maintains an online approximation of the inverse FIM, which is efficiently updated at quadratic cost using score vectors sampled at successive iterates. In the Riemannian setting, these score vectors belong to different tangent spaces and must be combined using transport operations. We prove almost-sure convergence rates of $O(\log s / s^{\alpha})$ for the squared distance to the minimizer when the step size exponent $\alpha > 2/3$. We also establish almost-sure rates for the approximate FIM, which now accumulates transport-based errors. A limited-memory variant of the algorithm with sub-quadratic storage complexity is proposed. Finally, we demonstrate the effectiveness of our method relative to its Euclidean counterparts on variational Bayes with Gaussian approximations and normalizing flows.
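To make the phrase "inversion-free ... updated at quadratic cost" concrete, the following is a minimal Euclidean sketch of one standard way to maintain the inverse of a sum of score outer products: the Sherman-Morrison rank-one identity. It is an illustration under stated assumptions, not the paper's algorithm. The abstract does not name the identity it uses, the function name `sherman_morrison_update` is hypothetical, the score vectors are replaced by random Gaussian draws, and the transport between tangent spaces required in the Riemannian setting is omitted entirely (in flat space it is the identity map).

```python
import numpy as np


def sherman_morrison_update(H, g):
    """Given symmetric H = A^{-1}, return (A + g g^T)^{-1} in O(d^2) time.

    Hypothetical helper for illustration; assumes H is symmetric.
    """
    Hg = H @ g  # O(d^2) matrix-vector product
    return H - np.outer(Hg, Hg) / (1.0 + g @ Hg)


rng = np.random.default_rng(0)
d = 5
A = np.eye(d)  # running sum of score outer products, initialised at the identity
H = np.eye(d)  # its inverse, maintained without calling a matrix inverse

for _ in range(200):
    g = rng.standard_normal(d)  # stand-in for a sampled score vector (assumption)
    A += np.outer(g, g)
    H = sherman_morrison_update(H, g)

# Check against a direct inversion, used here only for verification.
max_err = np.max(np.abs(H - np.linalg.inv(A)))
```

Each update uses one matrix-vector product and one outer product, so the per-iteration cost is quadratic in the dimension `d` rather than cubic as a fresh inversion would be. On a manifold, successive score vectors live in different tangent spaces, which is why the method described above combines them through transport operations before any such update.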