The conjugate gradient method is a crucial first-order optimization method that generally converges faster than steepest descent, at a computational cost much lower than that of second-order methods. However, while various conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, they have received little study in distributed settings. This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method for minimizing a global function over the Stiefel manifold. The optimization problem is distributed among a network of agents, where each agent is associated with a local function and agents communicate over an undirected connected graph. Because the Stiefel manifold is a non-convex set, the problem is non-convex even before the objective is considered; the global function is represented as a finite sum of possibly non-convex (but smooth) local functions. The proposed method is free from expensive Riemannian geometric operations such as retractions, exponential maps, and vector transports, thereby reducing the computational cost incurred by each agent. To the best of our knowledge, DRCGD is the first decentralized Riemannian conjugate gradient algorithm to achieve global convergence over the Stiefel manifold.
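To make the problem setting concrete, the following is a minimal sketch of its ingredients, not the DRCGD algorithm itself. It assumes an illustrative PCA-style local cost f_i(X) = -0.5 tr(X^T A_i X) and a ring graph with uniform weights; both choices, and all function names, are hypothetical and are not taken from the paper. The sketch shows a point on the Stiefel manifold, the projection of a local Euclidean gradient onto the tangent space, a symmetric doubly stochastic mixing matrix for an undirected connected graph, and why plain averaging of agents' iterates can leave the manifold.

```python
import numpy as np


def random_stiefel(d, r, rng):
    """Return a d x r matrix with orthonormal columns, i.e. a point on St(d, r)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return q


def on_stiefel(x, tol=1e-8):
    """Check the defining constraint X^T X = I."""
    return np.allclose(x.T @ x, np.eye(x.shape[1]), atol=tol)


def tangent_projection(x, g):
    """Orthogonally project g onto the tangent space of St(d, r) at x."""
    xtg = x.T @ g
    return g - x @ (xtg + xtg.T) / 2.0


def local_riemannian_grad(a_i, x):
    """Riemannian gradient of the assumed local cost f_i(X) = -0.5 * tr(X^T A_i X)."""
    euclidean_grad = -a_i @ x  # correct only because a_i is assumed symmetric
    return tangent_projection(x, euclidean_grad)


def ring_mixing_matrix(n):
    """Symmetric doubly stochastic weights for an undirected ring of n agents."""
    w = np.zeros((n, n))
    for i in range(n):
        # Each agent weights itself and its two ring neighbours equally.
        w[i, i] += 1.0 / 3.0
        w[i, (i - 1) % n] += 1.0 / 3.0
        w[i, (i + 1) % n] += 1.0 / 3.0
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_agents, d, r = 5, 6, 2

    # One local symmetric data matrix and one local iterate per agent.
    data = []
    for _ in range(n_agents):
        b = rng.standard_normal((d, d))
        data.append(b @ b.T)
    iterates = [random_stiefel(d, r, rng) for _ in range(n_agents)]

    grads = [local_riemannian_grad(a, x) for a, x in zip(data, iterates)]
    w = ring_mixing_matrix(n_agents)

    # Euclidean consensus averaging of agent 0's neighbourhood.
    averaged = sum(w[0, j] * iterates[j] for j in range(n_agents))
    print("local iterates feasible:", all(on_stiefel(x) for x in iterates))
    print("averaged iterate feasible:", on_stiefel(averaged))
```

The last check illustrates the non-convexity of the constraint set: a weighted average of feasible points is generally infeasible, so a decentralized method needs some mechanism to restore or preserve feasibility after communication. How DRCGD handles this is not shown here.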