Federated Learning is a privacy-preserving, decentralized approach to Machine Learning tasks. In industry deployments characterized by a small number of entities possessing abundant data, the role each participant plays in shaping the global model becomes pivotal: participation in a federation incurs costs, and participants may expect compensation for their involvement. Contribution assessments also serve as a crucial means of identifying potential malicious actors and free-riders. However, fairly assessing individual contributions remains a significant hurdle. Recent works have demonstrated considerable inherent instability in contribution estimates across aggregation strategies. While switching to a different strategy may offer convergence benefits, this instability can harm participants' willingness to engage in the federation. In this work, we introduce FedRandom, a novel technique for mitigating the contribution instability problem. By treating the instability as a statistical estimation problem, FedRandom generates more samples than regular FL strategies do, and we show that these additional samples yield a more consistent and reliable evaluation of participant contributions. We demonstrate our approach under different data distributions across CIFAR-10, MNIST, CIFAR-100, and FMNIST, and show that FedRandom reduces the overall distance to the ground truth by more than a third in half of all evaluated scenarios and improves stability in more than 90% of cases.
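The statistical-estimation view above can be illustrated with a minimal, hypothetical sketch: if each contribution evaluation under a randomly parameterized aggregation strategy is treated as one noisy sample of a participant's true contribution, averaging many such samples stabilizes the estimate. The function names, the noise model, and the ground-truth values below are illustrative assumptions, not the paper's actual estimator.

```python
import random
import statistics

def contribution_estimate(true_contribs, seed):
    # Hypothetical stand-in for a single contribution evaluation under one
    # randomly parameterized aggregation run: it returns the ground-truth
    # contribution plus strategy-dependent noise (assumed Gaussian here).
    rng = random.Random(seed)
    return [c + rng.gauss(0, 0.1) for c in true_contribs]

def averaged_estimate(true_contribs, n_runs):
    # Average many randomized runs; the sample mean's standard error shrinks
    # roughly as 1/sqrt(n_runs), stabilizing the per-participant scores.
    runs = [contribution_estimate(true_contribs, seed) for seed in range(n_runs)]
    return [statistics.mean(vals) for vals in zip(*runs)]

truth = [0.5, 0.3, 0.2]  # illustrative ground-truth participant contributions
single = contribution_estimate(truth, seed=0)
many = averaged_estimate(truth, n_runs=100)
err_many = max(abs(a - b) for a, b in zip(many, truth))
```

Under these assumptions, the worst-case per-participant error of the averaged estimate is an order of magnitude smaller than the noise of any single run, which is the intuition behind drawing additional samples rather than relying on one fixed aggregation strategy.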