The vulnerability of deep neural networks to adversarial perturbations is widely recognized in the computer vision community. From a security perspective, it poses a critical risk to modern vision systems, e.g., the popular Deep Learning as a Service (DLaaS) frameworks. To protect deep models without modifying them, current algorithms typically detect adversarial patterns through a discriminative decomposition of natural and adversarial data. However, these decompositions are biased towards either frequency resolution or spatial resolution, and thus fail to capture adversarial patterns comprehensively. Moreover, when the detector relies on a few fixed features, it is practical for an adversary to fool the model while evading the detector (i.e., a defense-aware attack). Motivated by these facts, we propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition. It extends the above works in two respects: 1) compared with the common trigonometric or wavelet bases, the introduced Krawtchouk basis provides better spatial-frequency discriminability, capturing the differences between natural and adversarial data comprehensively in both spatial and frequency distributions; 2) compared with a detector that uses a few fixed features, the extensive features formed by the Krawtchouk decomposition allow for adaptive feature selection and a secrecy mechanism, significantly increasing the difficulty of a defense-aware attack. Theoretical and numerical analyses demonstrate the uniqueness and usefulness of our detector, which exhibits competitive scores on several deep models and image sets against a variety of adversarial attacks.
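To make the idea of a Krawtchouk decomposition concrete, the following is a minimal sketch, not the authors' implementation. It builds weighted, orthonormal Krawtchouk polynomials from their standard hypergeometric definition, projects a grayscale image onto the separable 2D basis, and then picks a key-dependent subset of coefficients. The function names (`krawtchouk_basis`, `decompose`, `reconstruct`, `select_features`), the parameter `p`, the 8x8 image size, and the keyed random selection are all assumptions made for illustration; the abstract does not specify how feature selection or secrecy is actually realized.

```python
import numpy as np
from math import comb


def krawtchouk_basis(size, p=0.5):
    """Weighted orthonormal Krawtchouk polynomials on x = 0..size-1.

    Row n holds the order-n basis function. Uses the textbook definition
    K_n(x; p, N) = 2F1(-n, -x; -N; 1/p) with N = size - 1, which is
    numerically adequate for small sizes only.
    """
    N = size - 1
    K = np.zeros((size, size))
    for n in range(size):
        for x in range(size):
            term, total = 1.0, 1.0
            # The hypergeometric series terminates after min(n, x) terms.
            for k in range(min(n, x)):
                term *= (k - n) * (k - x) / ((k - N) * (k + 1) * p)
                total += term
            K[n, x] = total
    # Binomial weight w(x) and squared norm rho(n) of the classical polynomials.
    w = np.array([comb(N, x) * p**x * (1 - p) ** (N - x) for x in range(size)])
    rho = np.array([((1 - p) / p) ** n / comb(N, n) for n in range(size)])
    return K * np.sqrt(w)[None, :] / np.sqrt(rho)[:, None]


def decompose(img, p_y=0.5, p_x=0.5):
    """Project a 2D array onto the separable Krawtchouk basis."""
    Ky = krawtchouk_basis(img.shape[0], p_y)
    Kx = krawtchouk_basis(img.shape[1], p_x)
    return Ky @ img @ Kx.T


def reconstruct(coeffs, p_y=0.5, p_x=0.5):
    """Invert decompose(); exact up to rounding because the basis is orthonormal."""
    Ky = krawtchouk_basis(coeffs.shape[0], p_y)
    Kx = krawtchouk_basis(coeffs.shape[1], p_x)
    return Ky.T @ coeffs @ Kx


def select_features(coeffs, key, count):
    """Hypothetical secrecy step: a secret integer key picks which coefficients are used."""
    rng = np.random.default_rng(key)
    idx = rng.choice(coeffs.size, size=count, replace=False)
    return coeffs.ravel()[idx]


# Toy usage on a random 8x8 "image" (assumed size, for illustration only).
image = np.random.default_rng(0).random((8, 8))
coeffs = decompose(image, p_y=0.3, p_x=0.7)
features = select_features(coeffs, key=1234, count=16)
```

In this sketch the order indices of `coeffs` play the role of frequency, while `p_y` and `p_x` shift where the low-order basis functions concentrate spatially, which is one way to read the "spatial-frequency" property named in the abstract. The keyed selection only illustrates why a large coefficient pool makes a defense-aware attack harder: an adversary who does not know `key` does not know which features the detector inspects.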