Recent research has shown that Deep Neural Networks (DNNs) are highly vulnerable to adversarial samples, which often transfer across models and can therefore be used to attack unknown black-box models. To improve the transferability of adversarial samples, several feature-based attack methods have been proposed that disrupt neuron activations in the middle layers. However, current state-of-the-art feature-based attacks typically incur additional computation cost to estimate the importance of neurons. To address this challenge, we propose a feature-level attack method based on Singular Value Decomposition (SVD). Our approach is inspired by the observation that the singular vectors associated with the larger singular values of middle-layer features exhibit superior generalization and attention properties. Specifically, we retain only the feature component associated with the top-1 singular value, compute output logits from it, and combine these logits with the original logits to optimize the adversarial examples. Extensive experimental results verify the effectiveness of the proposed method, which can be easily integrated into various baselines to significantly enhance the transferability of adversarial samples against both normally trained CNNs and advanced defense strategies. The source code of this study is available at \textcolor{blue}{\href{https://anonymous.4open.science/r/SVD-SSA-13BF/README.md}{Link}}.
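As a rough illustration of the top-1 step described above, the following NumPy sketch reconstructs a feature map from its largest singular value and adds the resulting logits to the original ones. Several details are assumptions made only for illustration and are not taken from the paper: the feature map is flattened to a channels-by-positions matrix before the SVD, the classifier head `head` is a hypothetical linear layer over globally averaged features, and the two logit vectors are combined by a weighted sum with a hypothetical `weight` parameter.

```python
import numpy as np


def top1_feature(feat):
    """Rank-1 reconstruction of a (C, H, W) feature map.

    Assumption: the feature map is flattened to a (C, H*W) matrix
    before the SVD; the paper may reshape it differently.
    """
    c, h, w = feat.shape
    mat = feat.reshape(c, h * w)
    # Thin SVD; singular values in s are sorted in descending order.
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    # Keep only the component tied to the largest singular value.
    rank1 = s[0] * np.outer(u[:, 0], vt[0])
    return rank1.reshape(c, h, w)


def combined_logits(feat, head, weight=1.0):
    """Original logits plus weighted logits of the top-1 feature.

    Assumption: a weighted sum; the abstract only says the two
    sets of logits are "combined".
    """
    return head(feat) + weight * head(top1_feature(feat))


# Toy stand-ins for a middle-layer feature map and a classifier head.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))   # hypothetical (C, H, W) features
W = rng.standard_normal((10, 8))        # hypothetical linear head, 10 classes


def head(f):
    # Global average pooling over spatial positions, then a linear layer.
    return W @ f.mean(axis=(1, 2))


logits = combined_logits(feat, head)
```

In an actual attack, the adversarial loss would be computed on such combined logits and back-propagated to the input; that step needs an automatic-differentiation framework and is omitted here.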