Recent work has drawn attention to the vulnerability of Federated Learning (FL) systems to gradient leakage attacks. Such attacks exploit the gradients that clients upload to reconstruct their sensitive data, thereby compromising the privacy protection that FL is meant to provide. In response, various defense mechanisms have been proposed that mitigate this threat by manipulating the uploaded gradients. Unfortunately, empirical evaluations have shown that these defenses offer limited resilience against sophisticated attacks, indicating an urgent need for more effective defenses. In this paper, we explore a novel defensive paradigm that departs from conventional gradient perturbation approaches and instead focuses on the construction of robust data. Intuitively, if robust data exhibits low semantic similarity with clients' raw data, the gradients associated with robust data can effectively hide the raw data from attackers. To this end, we design Refiner, which jointly optimizes two metrics, one for privacy protection and one for performance maintenance. The utility metric promotes consistency between the gradients of key parameters associated with robust data and those derived from clients' data, thus maintaining model performance. The privacy metric guides the generation of robust data towards enlarging the semantic gap with clients' data. Theoretical analysis supports the effectiveness of Refiner, and empirical evaluations on multiple benchmark datasets demonstrate its superior effectiveness in defending against state-of-the-art attacks.
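The joint optimization described above can be illustrated with a toy sketch. It is not the authors' implementation, and every name in it (`model_gradient`, `utility_loss`, `privacy_distance`, `objective`, `refine`, the weight `beta`) is hypothetical. It assumes a linear model with squared loss, compares the gradients of all parameters rather than a selected set of key parameters, and uses squared Euclidean distance as a crude stand-in for the semantic gap between robust data and raw data.

```python
def model_gradient(w, x, y):
    """Gradient of the squared loss 0.5 * (w.x - y)^2 with respect to w."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]


def utility_loss(w, x_robust, x_raw, y):
    """Squared distance between robust-data and raw-data gradients (lower is better)."""
    g_robust = model_gradient(w, x_robust, y)
    g_raw = model_gradient(w, x_raw, y)
    return sum((a - b) ** 2 for a, b in zip(g_robust, g_raw))


def privacy_distance(x_robust, x_raw):
    """Assumed stand-in for the semantic gap: squared Euclidean distance (higher is better)."""
    return sum((a - b) ** 2 for a, b in zip(x_robust, x_raw))


def objective(w, x_robust, x_raw, y, beta):
    """Joint objective: keep gradients consistent while pushing the data apart."""
    return utility_loss(w, x_robust, x_raw, y) - beta * privacy_distance(x_robust, x_raw)


def numerical_grad(f, x, eps=1e-6):
    """Central-difference gradient of f at x (keeps the sketch dependency-free)."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        down = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def refine(w, x_raw, y, beta=0.1, lr=0.05, steps=100, x_init=None):
    """Search for robust data by gradient descent, accepting only improving steps."""
    # Assumption: start from a data-independent point (all zeros) unless told otherwise.
    x = list(x_init) if x_init is not None else [0.0] * len(x_raw)

    def f(z):
        return objective(w, z, x_raw, y, beta)

    best = f(x)
    for _ in range(steps):
        grad = numerical_grad(f, x)
        candidate = [xi - lr * gi for xi, gi in zip(x, grad)]
        value = f(candidate)
        if value < best:
            x, best = candidate, value  # accept the improving step
        else:
            lr *= 0.5  # reject the step and shrink the step size
    return x


if __name__ == "__main__":
    w = [0.5, -1.0, 2.0]      # current model parameters (toy values)
    x_raw, y = [1.0, 2.0, 3.0], 1.0
    x_robust = refine(w, x_raw, y)
    # In this sketch the client would upload the gradient of x_robust, not of x_raw.
    print("uploaded gradient:", model_gradient(w, x_robust, y))
    print("utility loss:", utility_loss(w, x_robust, x_raw, y))
    print("privacy distance:", privacy_distance(x_robust, x_raw))
```

The weight `beta` trades the two metrics against each other: a larger value favors distance from the raw data at the cost of gradient consistency. In this toy form the objective is unbounded below, so the fixed step budget also acts as an implicit limit on how far the robust data can drift.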