In Federated Learning with heterogeneous data, local models tend to converge toward their own local optima during local training, drifting away from the overall data distribution. Aggregating these local updates, e.g., with FedAvg, often yields a model that does not coincide with the global optimum (client drift), so the resulting update is suboptimal for most clients. Personalized Federated Learning approaches address this challenge by focusing exclusively on the average local performance of clients' models on their own data distributions. Generalization to out-of-distribution samples, which is a substantial benefit of FedAvg and a significant component of robustness, appears to be inadequately covered in their evaluation. This study thoroughly evaluates Federated Learning approaches with respect to both local performance and generalization capability. To this end, we examine different stages within a single communication round to enable a more nuanced understanding of the considered metrics. Furthermore, we propose and include a modified variant of FedAvg, designated Federated Learning with Individualized Updates (FLIU), which extends the algorithm with a straightforward individualization step that uses an adaptive personalization factor. We evaluate and compare the approaches empirically on MNIST and CIFAR-10 under various distributional conditions, including benchmark IID and pathological non-IID settings, as well as novel test environments based on the Dirichlet distribution that were specifically developed to stress the algorithms with complex data heterogeneity.
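To make the aggregation and individualization steps concrete, the following is a minimal sketch, not the authors' implementation. The `fedavg` function follows the standard sample-size-weighted average used by FedAvg. The `individualize` function is a hypothetical illustration: it assumes the individualization step is a convex interpolation between the global and the local model, controlled by a personalization factor `alpha`. The abstract does not state the actual FLIU update rule or how the factor is adapted, so both the interpolation form and the fixed value of `alpha` are assumptions, as are all function and variable names.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Standard FedAvg aggregation: average client parameters,
    weighted by the number of local training samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)  # shape: (num_clients, num_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()


def individualize(global_w, local_w, alpha):
    """ASSUMPTION (hypothetical, not taken from the paper): blend the
    aggregated global model with a client's local model.
    alpha = 0 keeps the global model, alpha = 1 keeps the local model."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * local_w + (1.0 - alpha) * global_w


# Toy example: three clients, each with a two-parameter "model".
local_models = [
    np.array([1.0, 0.0]),
    np.array([0.0, 1.0]),
    np.array([1.0, 1.0]),
]
sizes = [10, 30, 60]

global_model = fedavg(local_models, sizes)

# A fixed alpha is used purely for illustration; the abstract describes
# the personalization factor as adaptive without giving the rule.
personalized = individualize(global_model, local_models[0], alpha=0.5)

print(global_model)
print(personalized)
```

Under this assumed form, evaluating `global_model` and `personalized` on both a client's own data and held-out data from other clients would correspond to measuring local performance and generalization at different stages of one communication round.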