Federated learning (FL) is a distributed machine learning approach in which multiple local clients and a central server collaboratively train a model while the data remain on the clients' own devices. First-order methods, particularly those incorporating variance-reduction techniques, are the most widely used FL algorithms owing to their simple implementation and stable performance. However, these methods tend to converge slowly and require many communication rounds to reach the global minimizer. We propose FedOSAA, a novel approach that preserves the simplicity of first-order methods while achieving the rapid convergence typically associated with second-order methods. During local training, FedOSAA applies one Anderson acceleration (AA) step after the classical local updates of variance-reduced first-order methods such as FedSVRG and SCAFFOLD. This AA step leverages curvature information from historical iterates and produces a new update that approximates the Newton-GMRES direction, thereby significantly accelerating convergence. We establish a local linear convergence rate to the global minimizer of FedOSAA for smooth and strongly convex loss functions. Numerical comparisons show that FedOSAA substantially improves the communication and computation efficiency of the underlying first-order methods, achieving performance comparable to that of second-order methods such as GIANT.
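To make the core mechanism concrete, the following is a minimal single-machine sketch of an Anderson acceleration step applied after plain gradient-descent updates on a strongly convex quadratic. The problem instance, step size, and window length are illustrative assumptions, not taken from the paper; FedOSAA itself interleaves the AA step with variance-reduced local updates across clients.

```python
import numpy as np

def grad(x, A, b):
    # Gradient of the strongly convex quadratic 0.5 x^T A x - b^T x.
    return A @ x - b

def gd_step(x, A, b, lr):
    # Fixed-point map g(x) = x - lr * grad(x); its fixed point is the minimizer.
    return x - lr * grad(x, A, b)

def anderson_step(X, G):
    # One AA step from histories X = [x_{k-m}, ..., x_k] (rows) and
    # G = [g(x_{k-m}), ..., g(x_k)]. Residuals f_i = g(x_i) - x_i.
    F = G - X
    dF = (F[1:] - F[:-1]).T          # residual differences as columns
    # Least-squares coefficients for the standard difference form of AA.
    gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
    dG = (G[1:] - G[:-1]).T
    return G[-1] - dG @ gamma        # accelerated update

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 5))
A = Q.T @ Q + np.eye(5)              # symmetric positive definite Hessian
b = rng.standard_normal(5)
lr = 1.0 / np.linalg.eigvalsh(A).max()  # guarantees the map is contractive
x_star = np.linalg.solve(A, b)       # exact minimizer, for error tracking

x = np.zeros(5)
X_hist, G_hist = [], []
for k in range(30):
    gx = gd_step(x, A, b, lr)
    X_hist.append(x)
    G_hist.append(gx)
    if len(X_hist) >= 2:
        # Replace the plain update with an AA step over a short history window.
        X = np.array(X_hist[-5:])
        G = np.array(G_hist[-5:])
        x = anderson_step(X, G)
    else:
        x = gx

err = np.linalg.norm(x - x_star)
```

On a linear problem like this, the AA least-squares step is closely related to GMRES applied to the gradient system, which is the sense in which the accelerated update approximates a Newton-GMRES direction; plain gradient descent with the same step size would need far more than 30 iterations to reach comparable accuracy.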