Understanding How Consistency Works in Federated Learning via Stage-wise Relaxed Initialization

Federated learning (FL) is a distributed paradigm that coordinates a massive number of local clients to collaboratively train a global model via stage-wise local training on heterogeneous datasets. Previous works have implicitly observed that FL suffers from the ``client-drift'' problem, which is caused by inconsistent optima across local clients. However, a solid theoretical analysis explaining the impact of this local inconsistency is still lacking. To alleviate the negative impact of ``client-drift'' and explore its substance in FL, we first design an efficient FL algorithm, \textit{FedInit}, which employs a personalized relaxed initialization state at the beginning of each local training stage. Specifically, \textit{FedInit} initializes the local state by moving away from the current global state in the direction opposite to the latest local state. This relaxed initialization helps to correct the local divergence and enhance the level of local consistency. Moreover, to further understand how inconsistency disrupts performance in FL, we introduce an excess risk analysis and study the divergence term to investigate the test error of the proposed \textit{FedInit} method. Our studies show that the optimization error is not sensitive to this local inconsistency, which instead mainly affects the generalization error bound in \textit{FedInit}. Extensive experiments are conducted to validate this conclusion. The proposed \textit{FedInit} achieves state-of-the-art~(SOTA) results compared to several advanced benchmarks without any additional costs. Meanwhile, stage-wise relaxed initialization can also be incorporated into current advanced algorithms to achieve higher performance in the FL paradigm.
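The relaxed initialization described above can be illustrated with a minimal sketch. The abstract does not state the update rule, so the form used here (global state plus a coefficient times the offset of the global state from the latest local state) is an assumption, as are the names `relaxed_init`, `global_state`, `last_local_state`, and the relaxation coefficient `beta`.

```python
from typing import List


def relaxed_init(
    global_state: List[float],
    last_local_state: List[float],
    beta: float,
) -> List[float]:
    """Return a client's starting point for the next local training stage.

    Assumed rule (not given in the abstract):
        init = global + beta * (global - last_local)
    i.e. step away from the global state, opposite to where the client's
    latest local state sits relative to it.
    """
    return [
        w + beta * (w - w_prev)
        for w, w_prev in zip(global_state, last_local_state)
    ]


# Hypothetical two-parameter model, for illustration only.
global_state = [1.0, 2.0]
last_local_state = [0.5, 3.0]

init_state = relaxed_init(global_state, last_local_state, beta=0.1)
plain_init = relaxed_init(global_state, last_local_state, beta=0.0)
print(init_state)
print(plain_init)
```

Under this assumed rule, setting `beta` to zero recovers the usual choice of starting every client from the global state, so the relaxation is a one-line change to the start of each local stage and needs no extra communication beyond the global state itself.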
