Federated Learning (FL) is an emerging paradigm that enables multiple users to collaboratively train a robust model in a privacy-preserving manner, without sharing their private data. Most existing FL approaches consider only traditional single-label image classification and ignore the effects of moving to multi-label image classification. Handling heterogeneity in users' local data distributions remains challenging in real-world FL, and the issue becomes even more severe in the multi-label setting. Because each local client may observe only partial label correlations during training, directly aggregating locally updated models does not produce satisfactory performance. Inspired by the recent success of Transformers in centralized settings, we propose Language-Guided Transformer (FedLGT), a novel FL framework for multi-label classification that aims to exploit and transfer knowledge across clients to learn a robust global model. Through extensive experiments on multi-label datasets (e.g., FLAIR and MS-COCO), we show that FedLGT achieves satisfactory performance and outperforms standard FL techniques under multi-label FL scenarios. Code is available at https://github.com/Jack24658735/FedLGT.
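For readers unfamiliar with the baseline, the "direct aggregation of locally updated models" mentioned above can be read as FedAvg-style weighted averaging of client parameters. The following is a minimal sketch under that assumption; it is not FedLGT's method, and the names `fedavg`, `client_weights`, and `client_sizes` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each model is a dict mapping a parameter
# name to a flat list of floats.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    global_weights: Weights = {}
    for name in client_weights[0]:
        dim = len(client_weights[0][name])
        # Element-wise weighted mean across all clients.
        global_weights[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)
        ]
    return global_weights


# Two toy clients; the second holds three times as much data as the first.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
merged = fedavg(clients, client_sizes=[1, 3])
```

The average treats every parameter independently and has no notion of which labels each client actually observed, which is one way to see why clients holding only partial label correlations can pull the global model in inconsistent directions.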