Adversarial attacks have been shown to pose a threat to Deep Neural Networks (DNNs), and many defenses have been proposed. However, improving robustness typically lowers clean accuracy to some extent, which implies a trade-off between accuracy and robustness. In this paper, we first observe empirically that, for the same architecture, standard and robust models differ markedly in the weight distributions of their filters. We then explain this phenomenon theoretically in terms of gradient regularization, which shows that the difference is an intrinsic property of DNNs; a static network architecture therefore struggles to improve accuracy and robustness at the same time. Second, based on this observation, we propose a sample-wise dynamic network architecture named Adversarial Weight-Varied Network (AW-Net), which handles clean and adversarial examples with a "divide and rule" weight strategy. AW-Net dynamically adjusts the network's weights according to regulation signals generated by an adversarial detector that responds directly to the input sample. Because the architecture is dynamic, clean and adversarial examples can be processed with different network weights, which makes it possible to improve accuracy and robustness simultaneously. A series of experiments demonstrates that the AW-Net architecture is well suited to handling both clean and adversarial examples and achieves a better accuracy-robustness trade-off than state-of-the-art robust models.
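To make the idea of sample-wise weight adjustment concrete, the following is a minimal sketch and not the AW-Net implementation: the abstract does not specify how the detector or the regulation signals are built. Every name here (`detector_score`, `regulation_signal`, `dynamic_linear`, `clean_scale`, `adv_scale`) is hypothetical, as is the choice of a logistic detector and a per-output-channel scaling of a single linear layer.

```python
import math

def detector_score(x, w_det, b_det):
    """Hypothetical adversarial detector: a logistic score in (0, 1).
    A higher score means the input looks more like an adversarial example."""
    z = sum(w * xi for w, xi in zip(w_det, x)) + b_det
    return 1.0 / (1.0 + math.exp(-z))

def regulation_signal(score, clean_scale, adv_scale):
    """Assumed form of the regulation signal: per-channel scales interpolated
    between a 'clean' setting (score = 0) and an 'adversarial' one (score = 1)."""
    return [(1.0 - score) * c + score * a
            for c, a in zip(clean_scale, adv_scale)]

def dynamic_linear(x, weight, bias, scales):
    """Linear layer whose weight rows (output channels) are rescaled per sample.
    Scaling the dot product by s is the same as scaling the weight row by s."""
    return [s * sum(w * xi for w, xi in zip(row, x)) + b
            for row, b, s in zip(weight, bias, scales)]

def forward(x, params):
    """Run the detector, derive the scales, and apply the adjusted layer."""
    score = detector_score(x, params["w_det"], params["b_det"])
    scales = regulation_signal(score, params["clean_scale"], params["adv_scale"])
    return dynamic_linear(x, params["weight"], params["bias"], scales), score
```

In this sketch the stored weights stay fixed, while the effective weights differ for every input because the scales depend on the detector's score for that input. This mirrors the "divide and rule" idea only at a toy scale; a real model would apply such signals inside convolutional layers and train the detector jointly or separately.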