Cross-silo federated learning offers a promising solution for collaboratively training robust and generalized AI models without compromising the privacy of local datasets, e.g., in healthcare, finance, and scientific projects that lack a centralized data facility. Nonetheless, because of the disparity of computing resources among different clients (i.e., device heterogeneity), synchronous federated learning algorithms suffer from degraded efficiency when waiting for straggler clients. Asynchronous federated learning algorithms, in turn, experience degradation in convergence rate and final model accuracy on non-identically and independently distributed (non-IID) heterogeneous datasets due to stale local models and client drift. To address these limitations in cross-silo federated learning with heterogeneous clients and data, we propose FedCompass, an innovative semi-asynchronous federated learning algorithm with a computing power-aware scheduler on the server side, which adaptively assigns varying amounts of training tasks to different clients using knowledge of each client's computing power. FedCompass ensures that multiple locally trained models from clients are received almost simultaneously as a group for aggregation, effectively reducing the staleness of local models. At the same time, the overall training process remains asynchronous, eliminating prolonged waiting periods caused by straggler clients. Using diverse non-IID heterogeneous distributed datasets, we demonstrate that FedCompass achieves faster convergence and higher accuracy than other asynchronous algorithms while remaining more efficient than synchronous algorithms when performing federated learning on heterogeneous clients. The source code for FedCompass is available at https://github.com/APPFL/FedCompass.
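The core idea of a computing power-aware scheduler can be sketched in a few lines: given an estimate of each client's speed, the server assigns a number of local training steps so that every client in a group finishes at roughly the same wall-clock time. The sketch below is a simplified illustration, not the actual FedCompass algorithm; the function name, the `q_min`/`q_max` step bounds, and the speed estimates are all assumptions made for this example.

```python
# Illustrative sketch only -- NOT the official FedCompass scheduler.
# Idea: assign local steps proportional to each client's estimated speed
# so that all clients in a group complete training nearly simultaneously,
# reducing the staleness of local models at aggregation time.

def assign_local_steps(speeds, target_time, q_min=1, q_max=500):
    """Return {client_id: local_steps} so each client finishes near target_time.

    speeds: {client_id: estimated local training steps per second}
    target_time: desired group completion time in seconds
    q_min, q_max: hypothetical bounds keeping any client's workload reasonable
    """
    return {
        cid: max(q_min, min(q_max, round(speed * target_time)))
        for cid, speed in speeds.items()
    }

# Three clients with a 10x spread in computing power:
speeds = {"A": 10.0, "B": 4.0, "C": 1.0}
steps = assign_local_steps(speeds, target_time=20.0)
# Each client's expected completion time is steps[cid] / speeds[cid] ~= 20 s,
# so the server receives all three local models as a group for aggregation.
```

Because faster clients perform more local steps per round while slower clients perform fewer, no client sits idle waiting for stragglers, yet the updates still arrive close together in time.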