Measuring the Transferability of $\ell_\infty$ Attacks by the $\ell_2$ Norm

Deep neural networks can be fooled by adversarial examples that differ only trivially from the original samples. To keep the difference imperceptible to human eyes, researchers bound the adversarial perturbations by the $\ell_\infty$ norm, which now commonly serves as the standard for aligning the strength of different attacks for a fair comparison. However, we argue that the $\ell_\infty$ norm alone is not sufficient for measuring attack strength, because even with a fixed $\ell_\infty$ distance, the $\ell_2$ distance also greatly affects attack transferability between models. This finding gives a more in-depth understanding of the attack mechanism: several existing methods attack black-box models better partly because they craft perturbations with 70% to 130% larger $\ell_2$ distances. Since larger perturbations naturally lead to better transferability, we advocate that attack strength be measured by both the $\ell_\infty$ and $\ell_2$ norms simultaneously. Our proposal is firmly supported by extensive experiments on the ImageNet dataset with 7 attacks, 4 white-box models, and 9 black-box models.
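To make the central observation concrete, the following minimal sketch shows how two perturbations with the same $\ell_\infty$ distance can have very different $\ell_2$ distances. It is an illustration only, not the paper's code or experimental setup: the helper name `perturbation_norms`, the budget `eps = 8/255`, the image shape, and the two synthetic perturbations are all assumptions chosen for the example.

```python
import numpy as np


def perturbation_norms(x_adv, x):
    """Return (l_inf, l_2) distances between an adversarial example and its original.

    Hypothetical helper for illustration; not taken from the paper.
    """
    delta = (np.asarray(x_adv, dtype=np.float64) - np.asarray(x, dtype=np.float64)).ravel()
    return float(np.max(np.abs(delta))), float(np.linalg.norm(delta))


rng = np.random.default_rng(0)
eps = 8 / 255          # assumed l_inf budget, chosen only for this example
shape = (3, 32, 32)    # assumed toy image shape (channels, height, width)
x = np.full(shape, 0.5)  # a flat gray "image" so that no clipping is needed

# Perturbation A: every pixel sits at +eps or -eps (saturates the l_inf ball).
delta_sign = eps * rng.choice([-1.0, 1.0], size=shape)

# Perturbation B: pixels drawn uniformly from [-eps, eps]; one entry is forced
# to eps so that both perturbations have exactly the same l_inf distance.
delta_unif = rng.uniform(-eps, eps, size=shape)
delta_unif.flat[0] = eps

linf_a, l2_a = perturbation_norms(x + delta_sign, x)
linf_b, l2_b = perturbation_norms(x + delta_unif, x)

print(f"A: l_inf={linf_a:.4f}  l_2={l2_a:.4f}")
print(f"B: l_inf={linf_b:.4f}  l_2={l2_b:.4f}")
print(f"l_2 ratio A/B = {l2_a / l2_b:.2f}")
```

Both perturbations are equally strong under the $\ell_\infty$ norm, yet perturbation A reaches the maximum possible $\ell_2$ distance of $\epsilon\sqrt{d}$ for $d$ pixels, while perturbation B falls well short of it. Reporting both norms, as the abstract advocates, exposes this gap.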
