Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees

Decentralized federated learning (DFL) captures FL settings where both (i) model updates and (ii) model aggregations are exclusively carried out by the clients without a central server. Existing DFL works have mostly focused on settings where clients conduct a fixed number of local updates between local model exchanges, overlooking heterogeneity and dynamics in communication and computation capabilities. In this work, we propose Decentralized Sporadic Federated Learning ($\texttt{DSpodFL}$), a DFL methodology built on a generalized notion of $\textit{sporadicity}$ in both local gradient and aggregation processes. $\texttt{DSpodFL}$ subsumes many existing decentralized optimization methods under a unified algorithmic framework by modeling the per-iteration (i) occurrence of gradient descent at each client and (ii) exchange of models between client pairs as arbitrary indicator random variables, thus capturing $\textit{heterogeneous and time-varying}$ computation/communication scenarios. We analytically characterize the convergence behavior of $\texttt{DSpodFL}$ for both convex and non-convex models and for both constant and diminishing learning rates, under mild assumptions on the communication graph connectivity, data heterogeneity across clients, and gradient noises. We show how our bounds recover existing results from decentralized gradient descent as special cases. Experiments demonstrate that $\texttt{DSpodFL}$ consistently achieves improved training speeds compared with baselines under various system settings.
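To make the notion of sporadicity concrete, here is a minimal sketch of one way the per-iteration indicator variables could drive a decentralized update. It is an illustration under stated assumptions, not the paper's exact update rule: the function name `dspodfl_sketch`, the toy scalar objectives $f_i(x) = \tfrac{1}{2}(x - c_i)^2$, the fixed mixing weight `mix`, and the use of independent Bernoulli draws (`p_grad`, `p_link`) for the indicators are all hypothetical choices made here. The abstract allows arbitrary indicator random variables.

```python
import random


def dspodfl_sketch(targets, edges, steps=3000, lr=0.02, mix=0.3,
                   p_grad=0.5, p_link=0.5, seed=0):
    """Toy sporadic decentralized gradient descent (illustrative only).

    Client i holds a scalar model x[i] and the assumed local objective
    f_i(x) = 0.5 * (x - targets[i]) ** 2. `edges` lists undirected links.
    """
    rng = random.Random(seed)
    x = [0.0] * len(targets)
    for _ in range(steps):
        # Assumed indicator: does client i take a gradient step this iteration?
        does_grad = [rng.random() < p_grad for _ in targets]
        # Assumed indicator: is link (i, j) used for a model exchange?
        active_links = [e for e in edges if rng.random() < p_link]

        new_x = list(x)
        # Sporadic aggregation: each active link pulls its two endpoints
        # toward each other, using the models from the start of the iteration.
        for i, j in active_links:
            diff = x[j] - x[i]
            new_x[i] += mix * diff
            new_x[j] -= mix * diff
        # Sporadic local update: only clients whose indicator fired descend.
        for i, c in enumerate(targets):
            if does_grad[i]:
                new_x[i] -= lr * (x[i] - c)
        x = new_x
    return x
```

For example, calling `dspodfl_sketch([0.0, 1.0, 2.0, 3.0], [(0, 1), (1, 2), (2, 3), (3, 0)])` runs four clients on a ring. With a constant learning rate the models should settle near the average of the targets rather than exactly on it. Setting `p_grad=1.0` and `p_link=1.0` makes every indicator fire at every iteration, which reduces the sketch to plain decentralized gradient descent, in the spirit of the special cases the abstract mentions.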

