We consider decentralized optimization problems in which a number of agents collaborate to minimize the average of their local functions by exchanging information over an underlying communication graph. Specifically, we consider an asynchronous model in which only a random subset of nodes performs computation at each iteration, while information can be exchanged between all the nodes and in an asymmetric fashion. For this setting, we propose an algorithm that combines gradient tracking with network-level variance reduction (in contrast to variance reduction within each node). This enables each node to track the average of the gradients of the objective functions. Our theoretical analysis shows that the algorithm converges linearly when the local objective functions are strongly convex, under mild connectivity conditions on the expected mixing matrices. In particular, our result does not require the mixing matrices to be doubly stochastic. In the experiments, we investigate a broadcast mechanism that transmits information from computing nodes to their neighbors, and confirm the linear convergence of our method on both synthetic and real-world datasets.
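For readers unfamiliar with gradient tracking, the following minimal sketch illustrates the classical synchronous variant only. It is not the algorithm proposed here: it has no random node activation, no network-level variance reduction, and it assumes a symmetric, doubly stochastic mixing matrix, which the proposed method does not require. The scalar quadratic local functions f_i(x) = 0.5 * a_i * (x - b_i)^2, the ring topology, the step size, and all names (`gradient_tracking`, `A`, `B`) are assumptions chosen for illustration.

```python
# Classical synchronous gradient tracking on a ring of n agents (illustrative only).
# Assumed local objectives: f_i(x) = 0.5 * a_i * (x - b_i)**2 with a_i > 0.

def gradient_tracking(a, b, alpha=0.01, iters=3000):
    n = len(a)

    def mix(v):
        # Lazy ring averaging: weight 1/2 on self, 1/4 on each ring neighbor.
        # This mixing matrix is symmetric and doubly stochastic (an assumption
        # of this baseline sketch, not of the method described in the abstract).
        return [0.5 * v[i] + 0.25 * v[(i - 1) % n] + 0.25 * v[(i + 1) % n]
                for i in range(n)]

    def grad(i, x):
        # Gradient of the assumed local quadratic f_i at x.
        return a[i] * (x - b[i])

    x = [0.0] * n                              # local estimates of the minimizer
    g = [grad(i, x[i]) for i in range(n)]      # most recent local gradients
    y = list(g)                                # trackers, initialized to local gradients

    for _ in range(iters):
        wx, wy = mix(x), mix(y)
        # Consensus step on x followed by a step along the tracked gradient.
        x_new = [wx[i] - alpha * y[i] for i in range(n)]
        g_new = [grad(i, x_new[i]) for i in range(n)]
        # Tracker update: mix, then add the change in the local gradient, so that
        # the average of y always equals the average of the current local gradients.
        y = [wy[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
    return x


# Hypothetical problem data for five agents.
A = [1.0, 1.5, 2.0, 2.5, 3.0]
B = [-1.0, 0.0, 1.0, 2.0, 3.0]

if __name__ == "__main__":
    # The minimizer of the average objective is sum(a_i * b_i) / sum(a_i) = 1.5.
    print(gradient_tracking(A, B))
```

In this baseline, every node computes a gradient at every iteration. The setting described above differs in that only a random subset of nodes computes at each step and the mixing matrices need not be doubly stochastic.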