An Extended Study of Human-like Behavior under Adversarial Training

Neural networks have a number of shortcomings. Amongst the severest ones is the sensitivity to distribution shifts which allows models to be easily fooled into wrong predictions by small perturbations to inputs that are often imperceivable to humans and do not have to carry semantic meaning. Adversarial training poses a partial solution to address this issue by training models on worst-case perturbations. Yet, recent work has also pointed out that the reasoning in neural networks is different from humans. Humans identify objects by shape, while neural nets mainly employ texture cues. Exemplarily, a model trained on photographs will likely fail to generalize to datasets containing sketches. Interestingly, it was also shown that adversarial training seems to favorably increase the shift toward shape bias. In this work, we revisit this observation and provide an extensive analysis of this effect on various architectures, the common $\ell_2$- and $\ell_\infty$-training, and Transformer-based models. Further, we provide a possible explanation for this phenomenon from a frequency perspective.

翻译：神经网络存在若干缺陷，其中最严重的问题之一是对分布偏移的敏感性——输入中微小的扰动（往往人类无法察觉且不携带语义信息）便可轻易误导模型做出错误预测。对抗训练通过让模型对抗最坏情况下的扰动进行训练，为这一问题提供了部分解决方案。然而，近期研究也指出神经网络的推理机制与人类存在差异：人类通过形状识别物体，而神经网络主要依赖纹理线索。例如，在照片上训练的模型往往难以泛化到包含简笔画的数集中。有趣的是，已有研究显示对抗训练似乎能促使模型向形状偏好的方向转移。本研究重新审视了这一现象，并在多种架构、常见的$\ell_2$-范数和$\ell_\infty$-范数训练以及基于Transformer的模型上对此效应进行了深入分析。此外，我们从频率域视角为该现象提供了可能的解释。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【斯坦福&Facebook】生成式对抗变换器，Generative Adversarial Transformers

专知会员服务

21+阅读 · 2021年4月21日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

46+阅读 · 2020年10月31日

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日