Federated Learning (FL) has made significant progress recently, enabling collaborative model training on data distributed across edge devices. In the standard FL paradigm, iterative gradient or model exchanges between devices and a centralized server create a severe efficiency bottleneck at the server. Existing decentralized FL approaches enable collaborative training without a central server, but they either rely on a synchronous mechanism that degrades FL convergence or adopt an asynchronous mechanism that ignores device staleness, resulting in inferior accuracy. In this paper, we propose AEDFL, an Asynchronous Efficient Decentralized FL framework for heterogeneous environments, with three contributions. First, we propose an asynchronous FL system model with an efficient model aggregation method to improve FL convergence. Second, we propose a dynamic staleness-aware model update approach to achieve superior accuracy. Third, we propose an adaptive sparse training method to reduce communication and computation costs without significant accuracy degradation. Extensive experiments on four public datasets and four models demonstrate the advantages of AEDFL in terms of accuracy (up to 16.3% higher), efficiency (up to 92.9% faster), and computation costs (up to 42.3% lower).