Federated learning aims to exploit private data distributed across clients, maximizing data utility without privacy leakage. Previous federated learning research has focused mainly on multi-class classification. Multi-label classification, however, is a crucial research problem that is closer to the properties of real-world data, yet only a limited number of federated learning studies have explored it. Existing studies of multi-label federated learning did not consider the characteristics of multi-label data: they evaluated their methods under multi-class classification settings, which makes it infeasible to apply their methods to real-world applications. This study therefore proposes a new multi-label federated learning framework with Clustering-based Multi-label Data Allocation (CMDA) and a novel aggregation method, Fast Label-Adaptive Aggregation (FLAG), for multi-label classification in the federated learning environment. The experimental results demonstrate that our methods need less than 50\% of the training epochs and communication rounds to surpass the performance of state-of-the-art federated learning methods.