The vulnerability of deep neural networks (DNNs) has been preliminarily verified. Existing black-box adversarial attacks usually require multi-round interaction with the victim model and consume large numbers of queries, which is impractical in the real world and hard to scale to the recently emerged Video-LLMs. Moreover, no existing attack in the video domain directly leverages feature maps to shift the feature space of clean videos. We therefore propose FeatureFool, a stealthy, video-domain, zero-query black-box attack that uses information extracted from a DNN to alter the feature space of clean videos. Unlike query-based methods that rely on iterative interaction, FeatureFool mounts its attack directly on DNN-extracted information, requiring zero queries; to our knowledge, such efficiency is unprecedented in the video domain. Experiments show that FeatureFool achieves an attack success rate above 70\% against traditional video classifiers without any queries. Benefiting from the transferability of the feature map, it can also craft harmful content and bypass Video-LLM recognition. Additionally, adversarial videos generated by FeatureFool exhibit high quality in terms of SSIM, PSNR, and Temporal-Inconsistency, making the attack barely perceptible. This paper may contain violent or explicit content.
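To make the core idea concrete, below is a minimal sketch of a zero-query, feature-space attack in the spirit described above. Everything in it is an assumption for illustration, not the paper's actual method: it uses torchvision's r3d_18 as a stand-in surrogate video classifier, taps its `layer3` output as the feature map, and optimizes an $\ell_\infty$-bounded perturbation with Adam so that the clean video's feature map moves toward a target video's. The victim model is never queried; success against it would rest on feature-map transferability.

\begin{verbatim}
# Hypothetical sketch of a zero-query feature-space attack.
# Assumptions: r3d_18 as the surrogate, layer3 as the feature map,
# MSE feature loss, Adam optimizer, eps = 8/255. None of these are
# specified by the paper; they are illustrative choices only.
import torch
import torch.nn.functional as F
from torchvision.models.video import r3d_18

surrogate = r3d_18(weights="DEFAULT").eval()

# Capture an intermediate feature map via a forward hook.
features = {}
def hook(module, inputs, output):
    features["map"] = output
surrogate.layer3.register_forward_hook(hook)

def feature_map(video):
    # video: (1, 3, T, H, W) in [0, 1]; input normalization omitted.
    surrogate(video)
    return features["map"]

def featurefool_sketch(clean, target, eps=8/255, steps=100, lr=0.01):
    """Shift `clean`'s feature map toward `target`'s, never
    touching the black-box victim model (zero queries)."""
    with torch.no_grad():
        target_feat = feature_map(target).detach()
    delta = torch.zeros_like(clean, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = (clean + delta).clamp(0, 1)
        # Pull the adversarial feature map toward the target's.
        loss = F.mse_loss(feature_map(adv), target_feat)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep perturbation imperceptible
    return (clean + delta).detach().clamp(0, 1)

# Usage: adv_video = featurefool_sketch(clean_video, target_video)
\end{verbatim}

The $\ell_\infty$ clamp on the perturbation is what would keep SSIM and PSNR high; the actual attack's perturbation budget and loss are not disclosed in this section.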