Deep Gradient Leakage (DGL) is a highly effective attack that recovers private training images from gradient vectors. The attack poses significant privacy challenges for distributed learning among clients with sensitive data, where clients are required to share gradients. Defending against such attacks requires an understanding of when and how privacy leakage happens, but this understanding is currently lacking, mostly because of the black-box nature of deep networks. In this paper, we propose a novel Inversion Influence Function (I$^2$F) that establishes a closed-form connection between the recovered images and the private gradients by implicitly solving the DGL problem. Compared to directly solving DGL, I$^2$F scales to the analysis of deep networks, requiring only oracle access to gradients and Jacobian-vector products. We empirically demonstrate that I$^2$F effectively approximates DGL across different model architectures, datasets, attack implementations, and noise-based defenses. With this novel tool, we provide insights into effective gradient perturbation directions, the unfairness of privacy protection, and privacy-preferred model initialization. Our code is available at https://github.com/illidanlab/inversion-influence-function.
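To make the idea of relating recovered inputs to perturbed gradients concrete, the following is a minimal sketch on a toy problem, not the paper's I$^2$F definition or implementation. It assumes a one-sample linear regression model, solves the gradient-matching (inversion) problem with a hand-written Gauss-Newton loop started at the private input, and compares the resulting recovery error with a first-order prediction obtained by linearizing the gradient around the private input. The model, the solver, and all names (`grad_theta`, `jac_x`, `invert_gradient`) are assumptions made for illustration.

```python
import numpy as np

# Toy model (an assumption for illustration): one-sample linear regression
# f(x) = w.x + b with loss 0.5 * (f(x) - y)^2.
w = np.array([0.5, -1.0, 2.0])
b, y = 0.1, 0.0
x0 = np.array([1.0, 2.0, -1.0])  # the "private" input

def grad_theta(x):
    """Gradient of the loss w.r.t. (w, b), viewed as a function of the input x."""
    r = w @ x + b - y                      # residual of the prediction
    return np.concatenate([r * x, [r]])    # shape (4,)

def jac_x(x):
    """Jacobian of grad_theta w.r.t. x, shape (4, 3)."""
    r = w @ x + b - y
    top = r * np.eye(x.size) + np.outer(x, w)  # d(r * x)/dx
    return np.vstack([top, w])                 # last row is dr/dx = w

def invert_gradient(target, x_init, steps=20):
    """Gauss-Newton on min_x ||grad_theta(x) - target||^2 (gradient matching)."""
    x = x_init.copy()
    for _ in range(steps):
        step, *_ = np.linalg.lstsq(jac_x(x), grad_theta(x) - target, rcond=None)
        x = x - step
    return x

g0 = grad_theta(x0)
# Hypothetical defense noise added to the shared gradient.
delta = 1e-3 * np.array([1.0, -2.0, 0.5, 1.5])

# Recovery error obtained by actually solving the inversion problem.
# Starting at x0 isolates the local effect of the noise; a real attacker
# would start from a random guess.
dx_attack = invert_gradient(g0 + delta, x0) - x0

# First-order prediction: least-squares solution of jac_x(x0) @ dx = delta.
dx_linear, *_ = np.linalg.lstsq(jac_x(x0), delta, rcond=None)

rel_gap = np.linalg.norm(dx_attack - dx_linear) / np.linalg.norm(dx_linear)
print(np.linalg.norm(dx_attack), np.linalg.norm(dx_linear), rel_gap)
```

For small noise the two error vectors should agree closely, which is the sense in which a linearized, closed-form estimate can stand in for rerunning the attack. On a deep network the explicit Jacobian would be too large to form, which is presumably why access through Jacobian-vector products matters there.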