This paper investigates the impact of mechanism design on collaborative learning systems enabled by federated learning (FL). We propose a multi-action collaborative federated learning (MCFL) framework that captures the interplay among agent strategies, platform mechanisms, and FL algorithms, a "three-body problem" in collaborative learning. We show that the convergence rate and computational efficiency of FL are endogenously determined by the agent-participation equilibrium induced by the mechanism. In doing so, we establish a direct link between incentive design in collaborative learning systems and the performance of the underlying optimization algorithms, a connection largely overlooked in the existing literature. Specifically, we characterize the equilibrium of agent participation under two prominent mechanisms: the Shapley Value (SV) and Marginal Contribution (MC) mechanisms. Although SV is fair in surplus allocation and budget balanced, it has a critical pitfall: agents are incentivized to split their data across newly created fake identities. This is especially harmful in the MCFL setting, because it slows the convergence of FL optimization and thereby increases the number of required synchronization/communication rounds even when the per-round cost is fixed. In contrast, although MC is not budget balanced, it is robust to such strategic manipulation and induces an equilibrium that maximizes the efficiency of the MCFL system. Overall, our study lays a foundation for jointly designing incentives and algorithms in MCFL systems, and it highlights the pitfalls of SV: the system equilibrium it induces incurs substantially higher training cost and slower convergence, ultimately undermining the effectiveness of collaborative learning.
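The splitting incentive under SV, and MC's robustness to it, can be illustrated with a toy coalition game. This sketch is not taken from the paper: the value function v (square root of the coalition's total data, a stand-in for diminishing returns to data) and the agent data sizes are illustrative assumptions made purely to show the direction of the effect.

```python
import math
from itertools import permutations

# Illustrative assumption: a coalition's value is the square root of its
# total data, so data across agents is partially redundant (concave value).
def v(total_data):
    return math.sqrt(total_data)

def shapley(data):
    """Exact Shapley value: average marginal contribution over all orderings."""
    names = list(data)
    phi = {n: 0.0 for n in names}
    orders = list(permutations(names))
    for order in orders:
        running = 0.0  # data held by agents arriving before n in this ordering
        for n in order:
            phi[n] += v(running + data[n]) - v(running)
            running += data[n]
    return {n: phi[n] / len(orders) for n in names}

def marginal_contribution(data):
    """MC mechanism: pay each agent v(N) - v(N minus that agent)."""
    total = sum(data.values())
    return {n: v(total) - v(total - d) for n, d in data.items()}

# Honest reporting: agents A and B each hold 4 units of data.
honest = {"A": 4.0, "B": 4.0}
# Sybil strategy: A splits its data across two fake identities A1 and A2.
split = {"A1": 2.0, "A2": 2.0, "B": 4.0}

sv_honest = shapley(honest)["A"]
sv_split = shapley(split)["A1"] + shapley(split)["A2"]
mc_honest = marginal_contribution(honest)["A"]
mc_split = sum(marginal_contribution(split)[n] for n in ("A1", "A2"))

print(f"SV payoff to A: honest {sv_honest:.3f} < split {sv_split:.3f}")
print(f"MC payoff to A: honest {mc_honest:.3f} > split {mc_split:.3f}")
```

Under this concave value function, splitting strictly raises A's total SV payoff while SV remains budget balanced (payoffs sum to v of the grand coalition), whereas the same split strictly lowers A's total MC payoff, consistent with the abstract's comparison of the two mechanisms.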