A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

Deep Neural Networks (DNNs) are widely used for computer vision tasks. However, deep models have been shown to be vulnerable to adversarial attacks, i.e., their performance drops when imperceptible perturbations are applied to the original inputs, which may in turn degrade downstream vision tasks or introduce new problems such as data and privacy security risks. Hence, metrics for evaluating the robustness of deep models against adversarial attacks are needed. However, previous metrics were mainly proposed for evaluating the adversarial robustness of shallow networks on small-scale datasets. Although the Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER) metric has been proposed for large-scale datasets (e.g., the ImageNet dataset), it is computationally expensive and its performance relies on a tractable number of samples. In this paper, we propose the Adversarial Converging Time Score (ACTS), an attack-dependent metric that quantifies the adversarial robustness of a DNN on a specific input. Our key observation is that local neighborhoods on a DNN's output surface have different shapes for different inputs. Hence, converging to an adversarial sample takes a different amount of time for different inputs. Based on this geometric interpretation, ACTS uses the converging time as an adversarial robustness metric. We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset using state-of-the-art deep networks. Extensive experiments show that our ACTS metric is a more efficient and more effective adversarial robustness metric than the previous CLEVER metric.
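The idea that different inputs need different converging times can be illustrated with a toy sketch. The code below is not the ACTS definition from this paper. It assumes a hypothetical two-dimensional linear classifier and a simple iterative sign-gradient attack, and it counts the attack steps needed before the predicted label flips. The names `predict`, `steps_to_flip`, `step_size`, and `max_steps` are illustrative assumptions rather than anything defined in the paper.

```python
from typing import List, Optional, Sequence


def sign(value: float) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Toy linear classifier: label +1 if w.x + b >= 0, otherwise -1."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


def steps_to_flip(
    weights: Sequence[float],
    bias: float,
    x: Sequence[float],
    step_size: float = 0.01,
    max_steps: int = 1000,
) -> Optional[int]:
    """Count sign-gradient steps until the predicted label changes.

    Returns None if the label has not flipped after max_steps steps.
    This step count is only a stand-in for a "converging time";
    it is an assumption of this sketch, not the paper's score.
    """
    original_label = predict(weights, bias, x)
    x_adv: List[float] = list(x)
    for step in range(1, max_steps + 1):
        # For a linear score, the gradient with respect to x is the weight
        # vector, so each step moves every coordinate against the original label.
        x_adv = [
            xi - step_size * original_label * sign(w)
            for xi, w in zip(x_adv, weights)
        ]
        if predict(weights, bias, x_adv) != original_label:
            return step
    return None


if __name__ == "__main__":
    weights, bias = [1.0, -2.0], 0.0
    near_boundary = [0.05, 0.0]  # small margin to the decision boundary
    far_from_boundary = [1.0, 0.0]  # large margin to the decision boundary
    print("near:", steps_to_flip(weights, bias, near_boundary))
    print("far:", steps_to_flip(weights, bias, far_from_boundary))
```

In this toy setting, the input near the decision boundary flips after far fewer steps than the input far from it, which mirrors the intuition that a shorter converging time indicates lower robustness at that input. A real DNN has a nonlinear output surface, so the gradient would have to be recomputed at every step.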
