Federated learning (FL) enables multiple clients to train models collaboratively without sharing local data and has achieved promising results in many areas, including the Internet of Things (IoT). However, end IoT devices cannot automatically annotate the data they collect, which leads to a label shortage on the client side. To collaboratively train an FL model, we can rely only on a small amount of labeled data stored on the server. This is a new yet practical scenario in federated learning, namely labels-at-server semi-supervised federated learning (SemiFL). Although several SemiFL approaches have been proposed recently, none of them addresses personalization in its model design. IoT environments make SemiFL more challenging, as device computational constraints and communication cost must be considered simultaneously. To tackle these challenges together, we propose a novel SemiFL framework named pFedKnow. pFedKnow generates lightweight personalized client models via neural network pruning to reduce communication cost. Moreover, it incorporates large pretrained models as prior knowledge to guide the aggregation of the personalized client models and further improve performance. Experimental results on both image and text datasets show that pFedKnow outperforms state-of-the-art baselines while considerably reducing communication cost. The source code of pFedKnow is available at https://github.com/JackqqWang/pfedknow/tree/master.
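To make the pruning-and-aggregation idea concrete, the following is a minimal sketch and not the pFedKnow algorithm itself. The abstract does not specify the pruning criterion or the aggregation rule, so the sketch assumes magnitude pruning on a flat weight vector and a coordinate-wise average over the clients that kept each weight. The function names `magnitude_prune` and `masked_average` are hypothetical, and the guidance from pretrained large models is omitted entirely.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], keep_ratio: float) -> Tuple[List[float], List[int]]:
    """Keep the largest-magnitude weights and zero the rest.

    Assumption: magnitude pruning stands in for the unspecified pruning
    technique. Returns the pruned weights and a 0/1 mask.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    k = max(1, round(keep_ratio * len(weights)))
    # Indices of the k largest |w|; the stable sort breaks ties by position.
    top = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))[:k]
    mask = [0] * len(weights)
    for i in top:
        mask[i] = 1
    pruned = [w * m for w, m in zip(weights, mask)]
    return pruned, mask


def masked_average(
    client_weights: List[List[float]],
    client_masks: List[List[int]],
    fallback: List[float],
) -> List[float]:
    """Average each coordinate over the clients that kept it.

    Assumption: coordinates pruned by every client keep the server's
    previous value, passed in as `fallback`.
    """
    aggregated = []
    for j in range(len(fallback)):
        kept = [w[j] for w, m in zip(client_weights, client_masks) if m[j]]
        aggregated.append(sum(kept) / len(kept) if kept else fallback[j])
    return aggregated


# Two hypothetical clients, each keeping half of a four-weight model.
p1, m1 = magnitude_prune([0.5, -2.0, 0.1, 1.0], keep_ratio=0.5)
p2, m2 = magnitude_prune([3.0, 0.2, -0.1, 2.0], keep_ratio=0.5)
server = masked_average([p1, p2], [m1, m2], fallback=[9.0, 9.0, 9.0, 9.0])
print(m1, m2, server)
```

In a setup like this, each client would only need to upload its kept weights and its mask, which is one plausible way pruning can lower communication cost; how pFedKnow actually encodes and aggregates the personalized models should be checked against the released source code.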