Trade Privacy for Utility: A Learning-Based Privacy Pricing Game in Federated Learning

To prevent implicit privacy leakage when data owners (DOs) share gradients in federated learning (FL), differential privacy (DP) and its variants have become common practice, offering formal privacy guarantees with low overhead. However, individual DOs tend to inject larger DP noise for stronger privacy protection, which severely degrades model utility, while the curator (i.e., the aggregation server) aims to minimize the overall effect of the added random noise to achieve satisfactory model performance. To reconcile these conflicting goals, we propose a novel dynamic privacy pricing (DyPP) game that allows DOs to sell individual privacy (by lowering the scale of locally added DP noise) in exchange for differentiated economic compensation offered by the curator, thereby enhancing FL model utility. Given the multi-dimensional information asymmetry among players (e.g., a DO's data distribution and privacy preference, and the curator's maximum affordable payment), and given that players' private information varies across FL tasks, it is hard to directly attain the Nash equilibrium of the mixed-strategy DyPP game. Instead, we devise a fast two-layer reinforcement learning algorithm that learns the optimal mixed noise-saving strategy of the DOs and the optimal mixed pricing strategy of the curator without prior knowledge of the players' private information. Experiments on real datasets validate the feasibility and effectiveness of the proposed scheme in terms of faster convergence and enhanced FL model utility with lower payment costs.
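The trade-off that motivates the game can be illustrated with a minimal sketch. It assumes that each DO clips its gradient and adds independent Gaussian noise, and that the curator averages the perturbed gradients. The linear `payment` rule and all function names are hypothetical and are not taken from the DyPP formulation. The sketch only shows that lowering the local noise scales reduces the noise left in the aggregate.

```python
import math
import random


def clip(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def perturb(grad, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each coordinate."""
    return [g + rng.gauss(0.0, sigma) for g in grad]


def aggregate(grads):
    """Coordinate-wise mean of the submitted gradients (the curator's step)."""
    n = len(grads)
    return [sum(column) / n for column in zip(*grads)]


def aggregate_noise_std(sigmas):
    """Per-coordinate standard deviation of the noise remaining in the mean,
    assuming the DOs' noise terms are independent."""
    return math.sqrt(sum(s * s for s in sigmas)) / len(sigmas)


def payment(base_sigma, sold_sigma, unit_price):
    """Hypothetical linear compensation: pay per unit of noise scale given up."""
    return unit_price * (base_sigma - sold_sigma)


rng = random.Random(0)  # fixed seed so repeated runs draw the same noise
true_grad = clip([3.0, 4.0], max_norm=1.0)

base_sigmas = [2.0, 2.0, 2.0, 2.0]  # noise scales the DOs would prefer
sold_sigmas = [1.0, 1.0, 1.0, 1.0]  # reduced scales after selling privacy

for label, sigmas in (("baseline", base_sigmas), ("after selling", sold_sigmas)):
    noisy = [perturb(true_grad, s, rng) for s in sigmas]
    print(label, aggregate(noisy), aggregate_noise_std(sigmas))
```

Under these assumptions, halving every DO's noise scale halves the standard deviation of the aggregate noise. The actual utility gain and the prices needed to induce such a reduction depend on the game's payoff functions, which the abstract does not specify.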
