Decentralized Federated Learning (DFL) has recently received significant research attention. It captures settings where both model updates and model aggregations, the two key FL processes, are conducted by the clients. In this work, we propose Decentralized Sporadic Federated Learning ($\texttt{DSpodFL}$), a DFL methodology that generalizes the notion of sporadicity in both of these processes, modeling the impact of the different forms of heterogeneity that manifest in realistic DFL settings. $\texttt{DSpodFL}$ unifies many prominent decentralized optimization methods, e.g., distributed gradient descent (DGD), randomized gossip (RG), and decentralized federated averaging (DFedAvg), under a single modeling framework. We analytically characterize the convergence behavior of $\texttt{DSpodFL}$, showing, among other insights, that it matches a geometric convergence rate to a finite optimality gap under more general assumptions than existing works. Through experiments, we demonstrate that $\texttt{DSpodFL}$ trains significantly faster and is more robust to variations in system parameters than the state of the art.
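To make the notion of sporadicity in both processes concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes scalar models, quadratic local losses $f_i(x) = \tfrac{1}{2}(x - c_i)^2$, independent Bernoulli indicators for each client's gradient step and for each link's activation, and a constant step size. The function name `sporadic_dfl` and the parameters `p_grad`, `p_link`, `alpha`, and `w` are hypothetical names chosen for illustration.

```python
import random


def sporadic_dfl(targets, neighbors, p_grad, p_link,
                 alpha=0.05, w=0.3, iters=2000, seed=0):
    """Toy sporadic decentralized training on scalar quadratic losses.

    targets[i]   : minimizer c_i of client i's local loss 0.5 * (x - c_i)^2
    neighbors[i] : list of clients adjacent to client i (undirected graph)
    p_grad[i]    : assumed probability that client i takes a gradient step
    p_link       : assumed probability that a link is used for aggregation
    alpha, w     : constant step size and mixing weight (both assumptions)
    """
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n
    # Undirected edge list in a fixed order, so runs are reproducible.
    edges = sorted({(min(i, j), max(i, j))
                    for i in range(n) for j in neighbors[i]})
    for _ in range(iters):
        # Sporadic aggregation: each link is active independently.
        active = {e for e in edges if rng.random() < p_link}
        # Sporadic updates: each client computes a gradient independently.
        computes = [rng.random() < p_grad[i] for i in range(n)]
        new_x = []
        for i in range(n):
            # Consensus term over the links that are active this iteration.
            mix = sum(w * (x[j] - x[i]) for j in neighbors[i]
                      if (min(i, j), max(i, j)) in active)
            # Local gradient of 0.5 * (x - c_i)^2, or zero if skipped.
            grad = (x[i] - targets[i]) if computes[i] else 0.0
            new_x.append(x[i] + mix - alpha * grad)
        x = new_x
    return x


if __name__ == "__main__":
    n = 5
    ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    print(sporadic_dfl([0.0, 1.0, 2.0, 3.0, 4.0], ring, [0.5] * n, 0.5))
```

In this toy, setting every entry of `p_grad` and `p_link` to 1 makes each client mix with all of its neighbors and take a gradient step at every iteration, which is a DGD-style update. Smaller probabilities model clients and links that participate only occasionally. With a constant step size the iterates settle in a neighborhood of the global minimizer rather than exactly on it, which illustrates the finite optimality gap mentioned above.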