Boosting the Adversarial Transferability of Surrogate Models with Dark Knowledge

Deep neural networks (DNNs) are vulnerable to adversarial examples. And, the adversarial examples have transferability, which means that an adversarial example for a DNN model can fool another model with a non-trivial probability. This gave birth to the transfer-based attack where the adversarial examples generated by a surrogate model are used to conduct black-box attacks. There are some work on generating the adversarial examples from a given surrogate model with better transferability. However, training a special surrogate model to generate adversarial examples with better transferability is relatively under-explored. This paper proposes a method for training a surrogate model with dark knowledge to boost the transferability of the adversarial examples generated by the surrogate model. This trained surrogate model is named dark surrogate model (DSM). The proposed method for training a DSM consists of two key components: a teacher model extracting dark knowledge, and the mixing augmentation skill enhancing dark knowledge of training data. We conducted extensive experiments to show that the proposed method can substantially improve the adversarial transferability of surrogate models across different architectures of surrogate models and optimizers for generating adversarial examples, and it can be applied to other scenarios of transfer-based attack that contain dark knowledge, like face verification. Our code is publicly available at \url{https://github.com/ydc123/Dark_Surrogate_Model}.

翻译：深度神经网络（DNN）对对抗样本具有脆弱性。此外，对抗样本具有迁移性，即针对某个DNN模型生成的对抗样本能够以非平凡的概率欺骗另一个模型。这催生了基于迁移的攻击方法，即利用代理模型生成的对抗样本实施黑盒攻击。目前已有部分工作致力于从给定代理模型生成具有更好迁移性的对抗样本，但如何训练一个专门用于生成高迁移性对抗样本的代理模型仍相对缺乏研究。本文提出一种利用暗知识训练代理模型的方法，以提升该代理模型生成对抗样本的迁移性。该训练后的代理模型被称为暗代理模型（DSM）。训练DSM的方法包含两个关键组成部分：提取暗知识的教师模型，以及增强训练数据暗知识的混合增强技术。我们通过大量实验证明，该方法能够显著提升不同架构代理模型及不同优化器生成的对抗样本的迁移性，并可应用于包含暗知识的其他迁移攻击场景（如人脸验证）。我们的代码已开源在\url{https://github.com/ydc123/Dark_Surrogate_Model}。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

专知会员服务

55+阅读 · 2020年3月8日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日