This paper designs a Privacy-aware Client Sampling framework for Federated Learning, named FedPCS, to tackle heterogeneous client sampling issues and improve model performance. First, we derive a pioneering upper bound on the accuracy loss of the FL model under privacy-aware client sampling probabilities. Based on this, we model the interactions between the central server and participating clients as a two-stage Stackelberg game. In Stage I, the central server designs the optimal time-dependent reward for cost minimization, balancing the trade-off between the accuracy loss of the FL model and the rewards allocated. In Stage II, each client determines a correction factor that dynamically adjusts its privacy budget based on the reward allocated, so as to maximize its utility. To overcome the obstacle of approximating other clients' private information, we introduce a mean-field estimator to estimate the average privacy budget. We analytically establish the existence and convergence of the fixed point of the mean-field estimator and derive the Stackelberg Nash Equilibrium to obtain the optimal strategy profile. Through rigorous theoretical convergence analysis, we guarantee the robustness of FedPCS. Moreover, considering the conventional sampling strategy in privacy-preserving FL, we prove that the Price of Anarchy (PoA) of the random sampling approach can be arbitrarily large. To remedy this efficiency loss, we show that the proposed privacy-aware client sampling strategy successfully reduces the PoA, which is upper-bounded by an attainable constant. To address the challenge of varying privacy requirements across different training phases in FL, we extend our model and analysis and derive the adaptive optimal sampling ratio for the central server. Experimental results on multiple datasets demonstrate the superiority of FedPCS over existing SOTA FL strategies in both IID and Non-IID settings.