Differentially private federated learning (DP-FL) enables clients to collaboratively train machine learning models while preserving the privacy of their local data. However, most existing DP-FL approaches assume that all clients share a uniform privacy budget, an assumption that does not hold in real-world scenarios where privacy requirements vary widely. This privacy heterogeneity poses a significant challenge: conventional client selection strategies, which typically rely on data quantity, cannot distinguish between clients providing high-quality updates and those introducing substantial noise due to strict privacy constraints. To address this gap, we present the first systematic study of privacy-aware client selection in DP-FL. We establish a theoretical foundation by deriving a convergence analysis that quantifies the impact of privacy heterogeneity on training error. Building on this analysis, we propose a privacy-aware client selection strategy, formulated as a convex optimization problem, that adaptively adjusts selection probabilities to minimize training error. Extensive experiments on benchmark datasets demonstrate that our approach achieves up to a 10% improvement in test accuracy on CIFAR-10 compared to existing baselines under heterogeneous privacy budgets. These results highlight the importance of incorporating privacy heterogeneity into client selection for practical and effective federated learning.
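The abstract does not spell out the convex program, so the following is only a minimal illustrative sketch under common assumptions: the training-error proxy is taken to be a variance-style bound of the form \(\sum_i w_i^2 \sigma_i^2 / p_i\), with \(w_i\) the aggregation weight, \(\sigma_i\) the Gaussian-mechanism noise scale implied by each client's budget \(\epsilon_i\), and \(p_i\) the selection probability; the function name `select_probabilities`, its parameters, and the use of `cvxpy` are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np
import cvxpy as cp

def select_probabilities(n_samples, epsilons, m, clip_norm=1.0, delta=1e-5):
    """Illustrative sketch (hypothetical): pick selection probabilities p_i that
    minimize the error proxy  sum_i w_i^2 * sigma_i^2 / p_i,  where sigma_i is the
    Gaussian-mechanism noise std implied by (epsilon_i, delta) and w_i = n_i / sum_j n_j.
    Constraints fix the expected number of sampled clients at m and keep p_i in (0, 1]."""
    w = n_samples / n_samples.sum()                                   # aggregation weights
    sigma = clip_norm * np.sqrt(2 * np.log(1.25 / delta)) / epsilons  # per-client DP noise scale
    a = (w ** 2) * (sigma ** 2)                                       # per-client error contribution

    p = cp.Variable(len(epsilons))
    objective = cp.Minimize(cp.sum(cp.multiply(a, cp.inv_pos(p))))    # convex in p
    constraints = [cp.sum(p) == m, p >= 1e-6, p <= 1.0]
    cp.Problem(objective, constraints).solve()
    return p.value

# Example: 10 clients with heterogeneous privacy budgets; sample 4 clients per round in expectation.
rng = np.random.default_rng(0)
probs = select_probabilities(rng.integers(100, 1000, size=10).astype(float),
                             rng.uniform(0.5, 8.0, size=10), m=4)
print(np.round(probs, 3))
```

Under this assumed objective, clients with looser budgets (larger \(\epsilon_i\), hence smaller \(\sigma_i\)) receive higher selection probabilities, which is the qualitative behavior the abstract describes; the paper's actual bound and constraints may differ.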