Federated learning is a distributed machine learning framework in which participants cooperatively train a global model on their data and, in return, receive the global model and payments. Rational participants seek to maximize their individual utility, so they will not truthfully input their high-quality data unless they receive satisfactory payments based on that data's quality. At the same time, federated learning benefits from the cooperative contributions of its participants. How to design an incentive mechanism that both incentivizes truthful data input and promotes stable cooperation has therefore become an important problem. In this paper, we introduce a data sharing game model for federated learning and use game-theoretic approaches to design a core-selecting incentive mechanism built on the core, a popular solution concept in cooperative games. In federated learning, the core can be empty, which makes a core-selecting mechanism infeasible. To address this, our core-selecting mechanism employs a relaxation method and simultaneously minimizes the benefits of inputting false data for all participants. This mechanism is computationally expensive, however, because it requires aggregating a model for every possible coalition, and the number of coalitions grows exponentially with the number of participants, which is infeasible in federated learning. We therefore propose an efficient core-selecting mechanism based on sampling approximation, which aggregates models only on sampled coalitions to approximate the exact result. Extensive experiments verify that the efficient core-selecting mechanism incentivizes high-quality data input and stable cooperation while reducing computational overhead compared to the exact core-selecting mechanism.
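The ideas of an empty core, a relaxation, and sampled coalitions can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the mechanism proposed in the paper: the utility function `coalition_value` (square root of total data quality), the proportional payoff rule, and all function names are hypothetical choices made for this example. For a fixed payoff vector, it computes the smallest relaxation `eps` such that every coalition S satisfies "sum of payoffs in S >= v(S) - eps", first over all coalitions and then over a random sample of them.

```python
import itertools
import random


def coalition_value(coalition, quality):
    """Hypothetical utility v(S): concave in the coalition's total data quality."""
    return sum(quality[i] for i in coalition) ** 0.5


def all_coalitions(n):
    """Yield every non-empty proper subset of {0, ..., n-1} (2^n - 2 in total)."""
    for size in range(1, n):
        yield from itertools.combinations(range(n), size)


def sample_coalitions(n, m, rng):
    """Draw m random non-empty proper coalitions (each player joins with prob. 0.5)."""
    sampled = []
    while len(sampled) < m:
        members = tuple(i for i in range(n) if rng.random() < 0.5)
        if 0 < len(members) < n:
            sampled.append(members)
    return sampled


def required_relaxation(payoff, coalitions, quality):
    """Smallest eps >= 0 with sum_{i in S} payoff[i] >= v(S) - eps for each given S."""
    worst = 0.0
    for s in coalitions:
        deficit = coalition_value(s, quality) - sum(payoff[i] for i in s)
        worst = max(worst, deficit)
    return worst


# Assumed data-quality scores for six participants (illustrative values only).
quality = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
n = len(quality)

# Split the grand coalition's value in proportion to quality (an assumed rule).
grand = coalition_value(range(n), quality)
payoff = [grand * q / sum(quality) for q in quality]

# Exact check: enumerates all 2^n - 2 coalitions.
exact_eps = required_relaxation(payoff, all_coalitions(n), quality)

# Sampled check: evaluates only 20 randomly drawn coalitions.
rng = random.Random(0)
approx_eps = required_relaxation(payoff, sample_coalitions(n, 20, rng), quality)

print(f"exact eps = {exact_eps:.4f}, sampled eps = {approx_eps:.4f}")
```

With this concave utility, single participants can secure more on their own than the proportional split gives them, so the unrelaxed core constraints are violated and a strictly positive `eps` is needed. Because the sampled coalitions are a subset of all coalitions, the sampled estimate can never exceed the exact value; it trades accuracy for evaluating far fewer coalitions, which mirrors the motivation for the sampling approximation described above.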