Deep neural networks (DNNs) have been applied to many computer vision tasks and achieve state-of-the-art performance. However, DNNs misclassify adversarial examples, which are natural examples perturbed with human-imperceptible adversarial noise. This limits the application of DNNs in security-critical fields. To alleviate this problem, we first conduct an empirical analysis of the latent features of both adversarial and natural examples and find that the similarity matrix of natural examples is more compact than that of adversarial examples. Motivated by this observation, we propose \textbf{L}atent \textbf{F}eature \textbf{R}elation \textbf{C}onsistency (\textbf{LFRC}), which constrains the relation of adversarial examples in latent space to be consistent with that of natural examples. Importantly, LFRC is orthogonal to previous methods and can be easily combined with them to achieve further improvement. To demonstrate the effectiveness of LFRC, we conduct extensive experiments using different neural networks on benchmark datasets. For instance, against AutoAttack on CIFAR10, LFRC brings a further improvement of 0.78\% over AT and 1.09\% over TRADES. Code is available at https://github.com/liuxingbin/LFRC.
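The relation-consistency idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the similarity matrix or the consistency penalty is computed, so the choices here are assumptions made only for illustration: cosine similarity between latent feature vectors, and a mean absolute difference between the natural and adversarial similarity matrices. The function names `cosine_similarity_matrix` and `relation_consistency_loss` are hypothetical and are not taken from the authors' code.

```python
import math


def cosine_similarity_matrix(features):
    """Pairwise cosine-similarity matrix for a batch of feature vectors.

    Assumption: cosine similarity stands in for whatever relation
    measure the paper actually uses.
    """
    normed = []
    for vec in features:
        # Guard against zero vectors so the division is always defined.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]


def relation_consistency_loss(natural_features, adversarial_features):
    """Mean absolute difference between the two similarity matrices.

    Assumption: an L1-style penalty; the paper may use a different distance.
    """
    sim_nat = cosine_similarity_matrix(natural_features)
    sim_adv = cosine_similarity_matrix(adversarial_features)
    n = len(sim_nat)
    total = sum(
        abs(sim_nat[i][j] - sim_adv[i][j]) for i in range(n) for j in range(n)
    )
    return total / (n * n)


# Toy latent features for a batch of two examples (illustrative values only).
natural = [[1.0, 0.0], [0.0, 1.0]]
adversarial = [[1.0, 0.0], [1.0, 1.0]]

loss_same = relation_consistency_loss(natural, natural)      # identical relations
loss_diff = relation_consistency_loss(natural, adversarial)  # distorted relations
```

In an adversarial training setup, a term of this kind would presumably be added to the base objective (for example AT or TRADES) with a weighting coefficient, which is consistent with the abstract's statement that LFRC can be combined with previous methods.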