Deep Gradient Leakage (DGL) is a highly effective attack that recovers private training images from gradient vectors. This attack poses significant privacy challenges for distributed learning in which clients hold sensitive data and are required to share gradients. Defending against such attacks requires an understanding of when and how privacy leakage happens, yet this understanding is lacking, mostly because of the black-box nature of deep networks. In this paper, we propose a novel Inversion Influence Function (I$^2$F) that establishes a closed-form connection between the recovered images and the private gradients by implicitly solving the DGL problem. Compared to directly solving DGL, I$^2$F is scalable for analyzing deep networks, requiring only oracle access to gradients and Jacobian-vector products. We empirically demonstrate that I$^2$F effectively approximates DGL across different model architectures, datasets, modalities, attack implementations, and perturbation-based defenses. With this novel tool, we provide insights into effective gradient perturbation directions, the unfairness of privacy protection, and privacy-preferred model initialization. Our code is available at https://github.com/illidanlab/inversion-influence-function.
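To make the attack setting concrete, the following is a minimal sketch of gradient matching, the optimization that DGL-style attacks solve directly: the attacker searches for an input whose gradient equals the shared one. It is not the authors' implementation and does not compute I$^2$F. The linear-regression model, the finite-difference gradient (standing in for automatic differentiation), and all names (`shared_gradient`, `matching_loss`, `invert_gradient`) are assumptions chosen for illustration.

```python
# Toy gradient-matching attack (illustrative only; not the authors' code).
# Hypothetical model: linear regression with bias, loss = 0.5 * (w.x + b - y)^2.

def shared_gradient(x, w, b, y):
    """Gradient of the loss w.r.t. (w, b) for one private example x."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    return [residual * xi for xi in x] + [residual]

def matching_loss(x_guess, target_grad, w, b, y):
    """Squared distance between the guess's gradient and the shared one."""
    g = shared_gradient(x_guess, w, b, y)
    return sum((gi - ti) ** 2 for gi, ti in zip(g, target_grad))

def numerical_grad(f, x, eps=1e-6):
    """Central finite differences; stands in for autodiff in this sketch."""
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def invert_gradient(target_grad, w, b, y, steps=2000, lr=0.1):
    """Gradient descent with step halving, so the loss never increases."""
    f = lambda x: matching_loss(x, target_grad, w, b, y)
    x = [0.0] * len(w)  # the attacker starts from an uninformed guess
    for _ in range(steps):
        d = numerical_grad(f, x)
        step = lr
        for _ in range(30):  # halve the step until the loss decreases
            candidate = [xi - step * di for xi, di in zip(x, d)]
            if f(candidate) < f(x):
                x = candidate
                break
            step /= 2
    return x

# The attacker is assumed to know the model (w, b) and the label y.
w, b, y = [0.5, -0.3], 0.1, 1.0
x_private = [1.0, 2.0]
g_shared = shared_gradient(x_private, w, b, y)

x_rec = invert_gradient(g_shared, w, b, y)
initial_loss = matching_loss([0.0, 0.0], g_shared, w, b, y)
final_loss = matching_loss(x_rec, g_shared, w, b, y)
print("recovered:", x_rec, "matching loss:", final_loss)
```

If the optimization succeeds, `x_rec` approaches `x_private`, since in this toy model the private input is the only point whose gradient matches `g_shared` exactly. For deep networks this iterative search is expensive and hard to analyze, which is the motivation the abstract gives for a closed-form approximation.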