In federated learning, all networked clients cooperatively contribute to model training. However, as model sizes grow, even sharing the trained partial models often leads to severe communication bottlenecks in the underlying networks, especially when they are communicated iteratively. In this paper, we introduce FedD3, a federated learning framework that requires only one-shot communication by integrating dataset distillation instances. Instead of sharing model updates as other federated learning approaches do, FedD3 allows the connected clients to distill their local datasets independently, and then aggregates those decentralized distilled datasets (e.g., a few unrecognizable images) from the network for model training. Our experimental results show that FedD3 significantly outperforms other federated learning frameworks in terms of required communication volume, while offering the additional benefit of balancing the trade-off between accuracy and communication cost, depending on the usage scenario or target dataset. For instance, when training an AlexNet model on CIFAR-10 with 10 clients under a non-independent and identically distributed (Non-IID) setting, FedD3 can either increase the accuracy by over 71% at a similar communication volume, or save 98% of the communication volume while reaching the same accuracy, compared to other one-shot federated learning approaches.
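The one-shot flow described in the abstract (each client distills locally, uploads once, and the server trains on the pooled distilled data) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the per-class mean is only a stand-in for a real dataset distillation method, a nearest-neighbour rule replaces training a network such as AlexNet, and the function names (`distill_local`, `aggregate`, `predict`, `make_client_data`) and the synthetic Gaussian data are hypothetical rather than taken from FedD3.

```python
import random
from collections import defaultdict


def distill_local(samples):
    """Stand-in 'distillation': one mean vector per class present locally.

    This is NOT the distillation method used by FedD3; it only mimics the
    idea of compressing a local dataset into a few synthetic points.
    """
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return [([s / counts[y] for s in sums[y]], y) for y in sorted(sums)]


def aggregate(client_uploads):
    """Server side: pool the distilled points received from all clients."""
    pooled = []
    for upload in client_uploads:
        pooled.extend(upload)
    return pooled


def predict(pooled, x):
    """Nearest-neighbour prediction on the pooled distilled points
    (a simple substitute for training a model on them)."""
    nearest = min(
        pooled, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )
    return nearest[1]


def make_client_data(rng, classes, n_per_class, dim=4):
    """Synthetic data: class c is a Gaussian blob centred at 3*c."""
    return [
        ([rng.gauss(3.0 * c, 0.5) for _ in range(dim)], c)
        for c in classes
        for _ in range(n_per_class)
    ]


rng = random.Random(0)

# Non-IID split: each client only holds two of the four classes.
clients = [make_client_data(rng, cls, 50) for cls in ([0, 1], [2, 3], [0, 3])]

# One-shot communication: each client uploads its distilled set exactly once.
uploads = [distill_local(data) for data in clients]
pooled = aggregate(uploads)

# Communication volume, counted in transmitted floats (labels ignored).
floats_sent = sum(len(x) for upload in uploads for x, _ in upload)
floats_raw = sum(len(x) for data in clients for x, _ in data)

# Evaluate on fresh samples covering all four classes.
test_set = make_client_data(rng, [0, 1, 2, 3], 25)
accuracy = sum(predict(pooled, x) == y for x, y in test_set) / len(test_set)

print(f"floats uploaded: {floats_sent} (raw data would be {floats_raw})")
print(f"accuracy on held-out samples: {accuracy:.2f}")
```

The number of distilled points per class is the knob that would trade communication volume against accuracy in such a setup; the sketch fixes it at one point per class per client to keep the example short.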