Federated learning (FL) is a machine learning paradigm that allows multiple FL participants (FL-PTs) to collaboratively train models without sharing private data. Because of data heterogeneity, negative transfer may occur during FL training, which makes it necessary to select FL-PTs based on their data complementarity. In cross-silo FL, organizations that engage in business activities are key sources of FL-PTs. The resulting FL ecosystem has two features: (i) self-interest, and (ii) competition among FL-PTs. A desirable FL-PT selection strategy must therefore simultaneously mitigate the problems of free riders and conflicts of interest among competitors. To this end, we propose an optimal FL collaboration formation strategy, FedEgoists, which ensures that: (1) an FL-PT can benefit from FL if and only if it benefits the FL ecosystem, and (2) an FL-PT will not contribute to its competitors or their supporters. It provides an efficient clustering solution that groups FL-PTs into coalitions such that, within each coalition, FL-PTs share the same interest. We theoretically prove that the FL-PT coalitions formed are optimal, in the sense that no coalitions can collaborate to improve the utility of any of their members. Extensive experiments on widely adopted benchmark datasets demonstrate the effectiveness of FedEgoists compared to nine state-of-the-art baseline methods, and its ability to establish efficient collaborative networks in cross-silo FL with FL-PTs that engage in business activities.