We study first-order optimization algorithms for computing the barycenter of Gaussian distributions with respect to the optimal transport metric. Although the objective is geodesically non-convex, Riemannian gradient descent (GD) empirically converges rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and semidefinite programming (SDP) solvers. This stands in stark contrast to the best-known theoretical results for Riemannian GD, which depend exponentially on the dimension. In this work, we prove new geodesic convexity results that provide stronger control of the iterates, yielding a dimension-free convergence rate. Our techniques also enable the analysis of two related notions of averaging, the entropically regularized barycenter and the geometric median, providing the first convergence guarantees for Riemannian GD for these problems.
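To make the algorithm under study concrete, the following is a minimal sketch of Riemannian GD for the barycenter of centered Gaussians, identified with their covariance matrices. The abstract does not state the update rule; the one used here is the standard Bures-Wasserstein formulation, in which each step averages the optimal transport maps from the current iterate to the input covariances and pushes the iterate forward through that average. The function names, the initialization at the first input, and the default step size of 1 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def sqrtm_psd(A):
    """Symmetric square root of a positive semidefinite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues from round-off
    return (V * np.sqrt(w)) @ V.T


def transport_map(S, T):
    """Optimal transport map (a symmetric matrix) from N(0, S) to N(0, T).

    Uses the closed form S^{-1/2} (S^{1/2} T S^{1/2})^{1/2} S^{-1/2}.
    """
    S_half = sqrtm_psd(S)
    S_half_inv = np.linalg.inv(S_half)
    middle = sqrtm_psd(S_half @ T @ S_half)
    return S_half_inv @ middle @ S_half_inv


def barycenter_gd(covs, weights=None, step=1.0, iters=100):
    """Riemannian GD sketch for the Bures-Wasserstein barycenter of centered Gaussians.

    covs    : list of (d, d) symmetric positive definite covariance matrices
    weights : optional nonnegative weights summing to 1 (uniform if None)
    step    : step size (assumed default 1.0)
    """
    covs = [np.asarray(C, dtype=float) for C in covs]
    n = len(covs)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    d = covs[0].shape[0]
    identity = np.eye(d)
    S = covs[0].copy()  # assumption: initialize at the first input covariance
    for _ in range(iters):
        # Weighted average of the transport maps from the current iterate to each input.
        avg_map = sum(wi * transport_map(S, C) for wi, C in zip(w, covs))
        # Move along the geodesic in the negative gradient direction.
        G = (1.0 - step) * identity + step * avg_map
        S = G @ S @ G
        S = 0.5 * (S + S.T)  # re-symmetrize to limit round-off drift
    return S
```

As a sanity check, when the input covariances commute the barycenter has a closed form: its square root is the average of the input square roots. For `np.diag([1.0, 4.0])` and `np.diag([9.0, 16.0])` with uniform weights, `barycenter_gd` returns `diag(4, 9)` up to floating-point error.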