Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network

Model inversion (MI) attacks, which can reconstruct training data from public models, have raised increasing privacy concerns. MI attacks can be formalized as an optimization problem that searches for private data in a given space. Recent MI attacks use a generative adversarial network (GAN) as an image prior to narrow the search space, and can successfully reconstruct even high-dimensional data (e.g., face images). However, these generative MI attacks do not fully exploit the target model, so the search space remains vague and coupled, i.e., images of different classes are entangled in it. In addition, the cross-entropy loss widely used in these attacks suffers from vanishing gradients. To address these problems, we propose the Pseudo Label-Guided MI (PLG-MI) attack via a conditional GAN (cGAN). First, a top-n selection strategy provides pseudo-labels for public data, and these pseudo-labels guide the training of the cGAN. In this way, the search space is decoupled across image classes. Then, a max-margin loss is introduced to improve the search within the subspace of a target class. Extensive experiments demonstrate that the PLG-MI attack significantly improves attack success rate and visual quality across various datasets and models; notably, it is 2~3$\times$ better than state-of-the-art attacks under large distributional shifts. Our code is available at: https://github.com/LetheSec/PLG-MI-Attack.
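The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that top-n selection ranks public samples by the target model's predicted probability for each class, and that the max-margin loss is the largest non-target logit minus the target logit. The function names `top_n_pseudo_labels` and `max_margin_loss` are hypothetical, and plain Python lists stand in for model outputs.

```python
from typing import Dict, List, Sequence


def top_n_pseudo_labels(
    probs: Sequence[Sequence[float]], n: int
) -> Dict[int, List[int]]:
    """Assumed top-n selection: for each class k, keep the indices of the
    n public samples to which the target model assigns the highest
    probability for k. Those samples receive pseudo-label k."""
    if not probs:
        return {}
    num_classes = len(probs[0])
    selected: Dict[int, List[int]] = {}
    for k in range(num_classes):
        # Rank all public samples by their confidence for class k.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i][k], reverse=True)
        selected[k] = ranked[:n]
    return selected


def max_margin_loss(logits: Sequence[float], target: int) -> float:
    """Assumed max-margin form: largest non-target logit minus the target
    logit. Minimizing it pushes the target logit above all others."""
    best_other = max(v for j, v in enumerate(logits) if j != target)
    return best_other - logits[target]


if __name__ == "__main__":
    # Hypothetical target-model outputs for four public images, two classes.
    public_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
    print(top_n_pseudo_labels(public_probs, n=2))
    print(max_margin_loss([2.0, 5.0, 1.0], target=1))
```

In this form the loss is piecewise linear in the logits, so its gradient with respect to them does not shrink as the softmax output saturates, which is the weakness the abstract attributes to cross-entropy.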

