Collaborative learning (CL) enables multiple participants to jointly train machine learning (ML) models on decentralized data sources without sharing raw data. While the primary goal of CL is to maximize the expected accuracy gain for each participant, it is equally important to ensure that the gains are fairly distributed: no client should be negatively impacted, and gains should reflect contributions. Most existing CL methods require central coordination and focus only on gain maximization, overlooking fairness. In this work, we first show that the existing measure of collaborative fairness, based on the correlation between standalone and collaborative accuracies, has drawbacks because it does not account for negative collaboration gain. We argue that maximizing the mean collaboration gain (MCG) while simultaneously minimizing the collaboration gain spread (CGS) is a fairer alternative. Next, we propose the CYCle protocol, which enables individual participants in a private decentralized learning (PDL) framework to achieve this objective through a novel reputation scoring method based on gradient alignment between the local cross-entropy and distillation losses. We further extend the CYCle protocol to operate on top of gossip-based decentralized algorithms such as Gossip-SGD. We also theoretically show that CYCle outperforms standard FedAvg in a two-client mean estimation setting under high heterogeneity. Empirical experiments demonstrate the effectiveness of the CYCle protocol in ensuring positive and fair collaboration gain for all participants, even when the data distributions of participants are highly skewed.
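As an illustrative sketch only (the function names and the use of standard deviation as the spread measure are assumptions, not taken from the paper), the MCG/CGS fairness objective and a cosine-similarity-based gradient-alignment score could be computed along these lines:

```python
import numpy as np

def collaboration_gains(acc_solo, acc_collab):
    """Per-participant gain: accuracy with collaboration minus without."""
    return np.asarray(acc_collab) - np.asarray(acc_solo)

def mcg_cgs(gains):
    """Mean collaboration gain (MCG) and its spread (CGS).
    Standard deviation is used here as one plausible spread measure."""
    g = np.asarray(gains)
    return float(g.mean()), float(g.std())

def alignment_score(grad_ce, grad_distill, eps=1e-12):
    """Cosine alignment between the local cross-entropy gradient and the
    distillation-loss gradient; positive values suggest a peer's knowledge
    is compatible with local learning (hypothetical scoring rule)."""
    a, b = np.ravel(grad_ce), np.ravel(grad_distill)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Example: three clients; the third client loses accuracy by collaborating,
# which the correlation-based fairness measure would not penalize.
gains = collaboration_gains([0.70, 0.65, 0.80], [0.78, 0.72, 0.79])
mcg, cgs = mcg_cgs(gains)
```

Under this sketch, a fairness-aware protocol would aim for high `mcg` and low `cgs`, with every entry of `gains` non-negative.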