Federated learning aims to collaboratively learn a model from the data of multiple users under privacy constraints. In this paper, we study multi-label classification in the federated learning setting, where training may yield a trivial solution and extremely poor performance, especially when each client provides only positive data for a single class label. This issue can be addressed by adding a specially designed regularizer on the server side. Although sometimes effective, this approach ignores label correlations and may therefore yield sub-optimal performance. Moreover, frequently exchanging users' private embeddings between the server and clients is expensive and unsafe, especially when the model is trained contrastively. To remedy these drawbacks, we propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC). Specifically, FedALC estimates the label correlations of different label pairs during class embedding learning and uses them to improve model training. To further improve safety and reduce communication overhead, we propose a variant that learns a fixed class embedding for each client, so that the server and clients need to exchange class embeddings only once. Extensive experiments on multiple popular datasets demonstrate that FedALC significantly outperforms existing counterparts.
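To make the idea of a correlation-aware server-side regularizer more concrete, the following is a minimal sketch and not the formulation used by FedALC, which the abstract does not specify. It assumes a margin-based penalty that pushes class embeddings apart, with each pair's penalty scaled down by an estimated label correlation in [0, 1]. The function names, the `margin` parameter, and the `correlation` matrix are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_weighted_spreadout(class_embeddings, correlation, margin=1.0):
    """Hypothetical server-side penalty on class embeddings.

    Each unordered pair of classes closer than `margin` contributes
    (1 - correlation[i][j]) * (margin - distance)^2, so pairs assumed
    to be strongly correlated are pushed apart less. This weighting is
    an illustrative assumption, not the paper's regularizer.
    """
    total = 0.0
    n = len(class_embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            dist = euclidean(class_embeddings[i], class_embeddings[j])
            gap = max(0.0, margin - dist)
            total += (1.0 - correlation[i][j]) * gap ** 2
    return total


# Toy example: classes 0 and 1 are close and assumed correlated (0.8);
# class 2 lies beyond the margin from both and contributes nothing.
embeddings = [[0.0, 0.0], [0.5, 0.0], [3.0, 0.0]]
correlated = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]
uncorrelated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

penalty_correlated = correlation_weighted_spreadout(embeddings, correlated)
penalty_uncorrelated = correlation_weighted_spreadout(embeddings, uncorrelated)
```

In this toy setup the penalty for the close pair is smaller when the two labels are treated as correlated, which is the qualitative behavior the abstract motivates: labels that tend to co-occur need not be forced as far apart as unrelated ones.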