We consider a distributed learning problem in which agents minimize a global objective function by exchanging information over a network. Our approach has two distinct features: (i) it substantially reduces communication by triggering communication only when necessary, and (ii) it is agnostic to the data distribution among the agents. We can therefore guarantee convergence even if the agents' local data distributions are arbitrarily distinct. We analyze the convergence rate of the algorithm and derive accelerated convergence rates in a convex setting. We also characterize the effect of communication drops and show that our algorithm is robust to communication failures. The article concludes with numerical results from a distributed LASSO problem and from distributed learning tasks on the MNIST and CIFAR-10 datasets. The experiments underline communication savings of 50% or more due to the event-based communication strategy, show resilience to heterogeneous data distributions, and highlight that our approach outperforms common baselines such as FedAvg, FedProx, and FedADMM.