Federated learning (FL) is a machine learning paradigm that allows multiple FL participants (FL-PTs) to collaboratively train models without sharing private data. Because data are heterogeneous across FL-PTs, negative transfer may occur during FL training, so FL-PTs must be selected based on the complementarity of their data. In cross-silo FL, organizations that engage in business activities are key sources of FL-PTs. The resulting FL ecosystem has two features: (i) self-interest, and (ii) competition among FL-PTs. A desirable FL-PT selection strategy must therefore simultaneously mitigate the problems of free riders and of conflicts of interest among competitors. To this end, we propose an optimal FL collaboration formation strategy, FedEgoists, which ensures that: (1) an FL-PT can benefit from FL if and only if it benefits the FL ecosystem, and (2) an FL-PT will not contribute to its competitors or their supporters. It provides an efficient clustering solution that groups FL-PTs into coalitions such that, within each coalition, FL-PTs share the same interest. We theoretically prove that the resulting FL-PT coalitions are optimal, in the sense that no coalitions can collaborate to improve the utility of any of their members. Extensive experiments on widely adopted benchmark datasets demonstrate the effectiveness of FedEgoists compared to nine state-of-the-art baseline methods, and its ability to establish efficient collaborative networks in cross-silo FL with FL-PTs that engage in business activities.