Boosting Adversarial Attacks by Leveraging Decision Boundary Information

Because of the gap between a substitute model and a victim model, gradient-based noise generated on the substitute model may transfer poorly to the victim model, since the two models' gradients differ. Inspired by the fact that the decision boundaries of different models do not differ much, we conduct experiments and find that the gradients of different models are more similar on the decision boundary than at the original input position. Moreover, since the decision boundary in the vicinity of an input image is flat along most directions, we conjecture that boundary gradients can help find an effective direction for crossing the decision boundary of victim models. Based on this, we propose a Boundary Fitting Attack to improve transferability. Specifically, we introduce a method to obtain a set of boundary points and leverage the gradient information at these points to update the adversarial examples. Notably, our method can be combined with existing gradient-based methods. Extensive experiments demonstrate the effectiveness of our method: compared to state-of-the-art transfer-based attacks, it improves the success rate by 5.6% against normally trained CNNs and by 14.9% against defense CNNs on average. We further compare transformers with CNNs, and the results indicate that transformers are more robust than CNNs. However, our method still outperforms existing methods when attacking transformers. Specifically, when using CNNs as substitute models, our method obtains an average attack success rate of 58.2%, which is 10.8% higher than other state-of-the-art transfer-based attacks.
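The abstract does not say how the boundary points are obtained or how their gradients are combined, so the following is only a rough sketch of the general idea and not the authors' Boundary Fitting Attack. It assumes a toy linear softmax classifier as the substitute model, finds boundary points by binary search along randomly perturbed gradient directions, and averages the loss gradients at those points before taking a sign step within an L-infinity budget. All function names and parameters (`find_boundary_point`, `boundary_gradient_attack`, `n_points`, `sigma`, `max_scale`) are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict(W, b, x):
    return int(np.argmax(W @ x + b))

def loss_grad(W, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. x for a linear softmax model."""
    p = softmax(W @ x + b)
    p[y] -= 1.0                          # dL/dlogits = p - onehot(y)
    return W.T @ p

def find_boundary_point(W, b, x, y, direction, max_scale=10.0, steps=30):
    """Binary search along `direction` for a point just past the decision boundary.
    Returns None if the label does not change within `max_scale` (assumed search range)."""
    if predict(W, b, x + max_scale * direction) == y:
        return None
    lo, hi = 0.0, max_scale
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if predict(W, b, x + mid * direction) == y:
            lo = mid
        else:
            hi = mid
    return x + hi * direction            # hi side is always predicted != y

def boundary_gradient_attack(W, b, x, y, eps=0.5, iters=10,
                             n_points=8, sigma=0.5, seed=0):
    """Iterative sign-gradient attack driven by averaged boundary gradients (assumed scheme)."""
    rng = np.random.default_rng(seed)
    alpha = eps / iters
    x_adv = x.copy()
    for _ in range(iters):
        g = loss_grad(W, b, x_adv, y)
        grads = []
        for _ in range(n_points):
            # Perturb the normalized gradient direction to sample several boundary points.
            d = g / (np.linalg.norm(g) + 1e-12) + sigma * rng.standard_normal(x.shape)
            d /= np.linalg.norm(d) + 1e-12
            xb = find_boundary_point(W, b, x_adv, y, d)
            if xb is not None:
                grads.append(loss_grad(W, b, xb, y))
        # Fall back to the local gradient if no boundary point was found.
        g_avg = np.mean(grads, axis=0) if grads else g
        x_adv = x_adv + alpha * np.sign(g_avg)
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay inside the L-inf ball
    return x_adv

# Toy usage on a random 3-class linear model (illustrative data only).
demo_rng = np.random.default_rng(1)
W_demo = demo_rng.standard_normal((3, 5))
b_demo = demo_rng.standard_normal(3)
x_demo = demo_rng.standard_normal(5)
y_demo = predict(W_demo, b_demo, x_demo)
x_adv_demo = boundary_gradient_attack(W_demo, b_demo, x_demo, y_demo)
print("label changed:", predict(W_demo, b_demo, x_adv_demo) != y_demo)
```

In a realistic transfer setting the substitute would be a deep network with gradients from an autodiff framework, and the adversarial example would be evaluated on a separate victim model; the averaged gradient could also be fed into an existing momentum or input-transformation attack, in line with the abstract's remark that the method can be combined with other gradient-based methods.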
