Adversarial attacks have been shown to threaten Deep Neural Networks (DNNs), and many methods have been proposed to defend against them. However, enhancing robustness typically reduces clean accuracy to some extent, implying a trade-off between accuracy and robustness. In this paper, we first empirically find a clear distinction between the filters' weight distributions of standard and robust models of the same architecture, and then explain this phenomenon theoretically in terms of gradient regularization. The analysis shows that this difference is an intrinsic property of DNNs, and thus that it is difficult for a static network architecture to improve accuracy and robustness at the same time. Second, based on this observation, we propose a sample-wise dynamic network architecture named Adversarial Weight-Varied Network (AW-Net), which handles clean and adversarial examples with a ``divide and rule'' weight strategy. AW-Net dynamically adjusts the network's weights based on regulation signals generated by an adversarial detector, which is directly influenced by the input sample. Thanks to the dynamic architecture, clean and adversarial examples can be processed with different network weights, which makes it possible to enhance accuracy and robustness simultaneously. A series of experiments demonstrates that AW-Net is architecture-friendly in handling both clean and adversarial examples and achieves a better trade-off than state-of-the-art robust models.
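To make the idea of sample-wise weight adjustment concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how the regulation signal modifies the weights, so the additive interpolation `base + signal * delta`, the linear detector, and all names (`detector_signal`, `dynamic_linear`, `det_w`, `det_b`, `base_w`, `delta_w`) are assumptions chosen purely for illustration.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def detector_signal(x, det_w, det_b):
    """Hypothetical adversarial detector: a linear score passed through a sigmoid.

    A value near 0 is read as "looks clean", a value near 1 as "looks adversarial".
    """
    score = sum(w * xi for w, xi in zip(det_w, x)) + det_b
    return sigmoid(score)


def dynamic_linear(x, base_w, delta_w, signal):
    """Linear layer whose effective weights depend on a per-sample signal.

    Assumed rule (not taken from the paper): effective = base + signal * delta.
    """
    out = []
    for base_row, delta_row in zip(base_w, delta_w):
        out.append(
            sum((b + signal * d) * xi for b, d, xi in zip(base_row, delta_row, x))
        )
    return out


# Toy input and weights; every number here is made up for illustration.
x = [1.0, 2.0]
base_w = [[1.0, 0.0], [0.0, 1.0]]    # weights used when the signal is 0
delta_w = [[0.5, 0.5], [-1.0, 0.0]]  # signal-scaled adjustment

# The same input is processed with different effective weights
# depending on the signal the detector produces for it.
clean_out = dynamic_linear(x, base_w, delta_w, 0.0)
adv_out = dynamic_linear(x, base_w, delta_w, 1.0)

# An untrained detector with zero parameters outputs a neutral signal.
neutral = detector_signal(x, [0.0, 0.0], 0.0)
mixed_out = dynamic_linear(x, base_w, delta_w, neutral)
```

In this sketch the signal is computed from the input itself, so a single set of stored parameters yields different effective weights per sample, which is the property the abstract attributes to a dynamic architecture.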