Random Erasing vs. Model Inversion: A Promising Defense or a False Hope?

Model Inversion (MI) attacks pose a significant privacy threat by reconstructing private training data from machine learning models. While existing defenses concentrate primarily on model-centric approaches, the impact of data on MI robustness remains largely unexplored. In this work, we explore Random Erasing (RE), a technique traditionally used to improve model generalization under occlusion, and uncover its surprising effectiveness as a defense against MI attacks. Specifically, our novel feature space analysis shows that training a model on RE images introduces a significant discrepancy between the features of MI-reconstructed images and those of the private data. At the same time, features of private images remain distinct from other classes and well-separated from different classification regions. Together, these effects degrade MI reconstruction quality and attack accuracy while maintaining reasonable natural accuracy. Furthermore, we explore two critical properties of RE: Partial Erasure and Random Location. Partial Erasure prevents the model from observing entire objects during training, which we find has a significant impact on MI, since MI aims to reconstruct entire objects. Random Location of the erased region plays a crucial role in achieving a strong privacy-utility trade-off. Our findings highlight RE as a simple yet effective defense mechanism that can be easily integrated with existing privacy-preserving techniques. Extensive experiments across 37 setups demonstrate that our method achieves state-of-the-art (SOTA) performance in the privacy-utility trade-off. The results consistently demonstrate the superiority of our defense over existing methods across different MI attacks, network architectures, and attack configurations. For the first time, we achieve a significant degradation in attack accuracy without a decrease in utility for some configurations.
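To make the two properties named in the abstract concrete, the following is a minimal NumPy sketch of a random-erasing augmentation. It is an illustration under stated assumptions, not the authors' implementation: the function name `random_erase`, its parameters, and the default ranges are hypothetical and are not taken from the paper. Partial Erasure corresponds to blanking only a sub-rectangle of the image, and Random Location corresponds to sampling that rectangle's position anew on each call.

```python
import numpy as np


def random_erase(image, rng, area_frac=(0.1, 0.2), aspect=(0.5, 2.0),
                 fill=0.0, max_tries=10):
    """Return a copy of `image` (H, W, C) with one random rectangle erased.

    All defaults are illustrative assumptions, not settings from the paper.
    """
    h, w = image.shape[:2]
    out = image.copy()
    for _ in range(max_tries):
        # Partial Erasure: the erased area is only a fraction of the image.
        target_area = rng.uniform(*area_frac) * h * w
        # Sample the aspect ratio log-uniformly so that tall and wide
        # rectangles are equally likely.
        ratio = np.exp(rng.uniform(np.log(aspect[0]), np.log(aspect[1])))
        eh = int(round(np.sqrt(target_area * ratio)))
        ew = int(round(np.sqrt(target_area / ratio)))
        if 0 < eh < h and 0 < ew < w:
            # Random Location: the rectangle's position is drawn per call.
            top = int(rng.integers(0, h - eh + 1))
            left = int(rng.integers(0, w - ew + 1))
            out[top:top + eh, left:left + ew] = fill
            return out
    # No rectangle fit within max_tries; return the unmodified copy.
    return out


rng = np.random.default_rng(0)
image = np.ones((64, 64, 3), dtype=np.float32)
erased = random_erase(image, rng)
```

In a training pipeline, a transform of this kind would typically be applied to each private training image before it is fed to the classifier, so the model rarely sees a complete object.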

