Federated Learning (FL) is a decentralized method for training machine learning models. In FL, a global model iteratively collects the parameters of local models without accessing their local data. A significant challenge in FL is the heterogeneity of local data distributions, which often causes the global model to drift and makes convergence difficult. To address this issue, current methods employ strategies such as knowledge distillation, weighted model aggregation, and multi-task learning. These approaches are referred to as asynchronous FL because they align user models either locally or post hoc, after model drift has already occurred or has been underestimated. In this paper, we propose an active and synchronous correlation approach to address user heterogeneity in FL. Specifically, our approach aims to approximate FL as standard deep learning by actively and synchronously scheduling the users' learning pace in each round with a dynamic multi-phase curriculum. An auto-regressive auto-encoder on the server integrates all user curricula into a global curriculum. This global curriculum is then divided into multiple phases and broadcast to users to measure and align the domain-agnostic learning pace. Empirical studies demonstrate that our approach outperforms existing asynchronous approaches in generalization performance, even under severe user heterogeneity.
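For readers unfamiliar with the baseline setting, the parameter-collection step mentioned above can be sketched as a sample-size-weighted average of local parameters. This is only a minimal illustration of generic weighted model aggregation, not the curriculum-based approach proposed in the paper; the function name `aggregate`, the `Params` type, and the dictionary-of-lists parameter layout are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical layout: each model is a mapping from a parameter name
# to a flat list of floats. Real systems would use tensors instead.
Params = Dict[str, List[float]]


def aggregate(local_params: List[Params], num_samples: List[int]) -> Params:
    """Return the sample-size-weighted average of local model parameters.

    The server only sees parameters and sample counts, never the
    users' local data.
    """
    if not local_params or len(local_params) != len(num_samples):
        raise ValueError("need one sample count per local model")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    global_params: Params = {}
    for name in local_params[0]:
        size = len(local_params[0][name])
        # Weight each user's value by its share of the total samples.
        global_params[name] = [
            sum(p[name][i] * n for p, n in zip(local_params, num_samples)) / total
            for i in range(size)
        ]
    return global_params


if __name__ == "__main__":
    users = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
    print(aggregate(users, [1, 3]))
```

When local data distributions differ strongly, the averaged parameters can sit far from any single user's optimum, which is the drift the abstract describes.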