Universal adversarial perturbation (UAP), also known as image-agnostic perturbation, is a fixed perturbation map that fools a classifier with high probability on arbitrary images, making it more practical for attacking deep models in the real world. Previous UAP methods generate a scale-fixed and texture-fixed perturbation map for all images, which ignores the multi-scale objects in images and usually results in a low fooling ratio. Since widely used convolutional neural networks tend to classify objects according to semantic information stored in local textures, improving UAP by exploiting local content more effectively is a reasonable and intuitive approach. In this work, we find that the fooling ratio increases significantly when we add a constraint that encourages a small-scale UAP map and repeat that map vertically and horizontally to fill the whole image domain. Based on this finding, we propose texture scale-constrained UAP (TSC-UAP), a simple yet effective UAP enhancement method that automatically generates UAPs with category-specific local textures that fool deep models more easily. Through a low-cost operation that restricts the texture scale, TSC-UAP achieves a considerable improvement in fooling ratio and attack transferability for both data-dependent and data-free UAP methods. Experiments on two state-of-the-art UAP methods, eight popular CNN models and four classical datasets demonstrate the effectiveness of TSC-UAP.
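The tiling operation described above (a small-scale map repeated vertically and horizontally to fill the image domain) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `tile_perturbation` and `apply_uap`, the 56x56 patch size, the 224x224 image size, the L-infinity budget `eps = 10/255`, and the random patch standing in for a learned one are all assumptions made for illustration. How the patch is optimized is not shown.

```python
import numpy as np


def tile_perturbation(patch, height, width):
    """Repeat a small (h, w, c) patch vertically and horizontally,
    then crop the result to (height, width, c)."""
    h, w = patch.shape[:2]
    reps_y = -(-height // h)  # ceiling division: enough copies to cover the height
    reps_x = -(-width // w)   # ceiling division: enough copies to cover the width
    tiled = np.tile(patch, (reps_y, reps_x, 1))
    return tiled[:height, :width]


def apply_uap(images, patch, eps):
    """Add the tiled perturbation to a batch of images with values in [0, 1].

    images: array of shape (n, height, width, c)
    patch:  array of shape (h, w, c), the small-scale perturbation
    eps:    assumed L-infinity bound on the perturbation
    """
    patch = np.clip(patch, -eps, eps)  # keep the perturbation within the budget
    uap = tile_perturbation(patch, images.shape[1], images.shape[2])
    return np.clip(images + uap, 0.0, 1.0)  # keep pixels in the valid range


# Hypothetical data: a random patch stands in for a learned small-scale UAP.
rng = np.random.default_rng(0)
patch = rng.uniform(-1.0, 1.0, size=(56, 56, 3)).astype(np.float32)
images = rng.uniform(0.0, 1.0, size=(4, 224, 224, 3)).astype(np.float32)
adv = apply_uap(images, patch, eps=10 / 255)
```

Because the same patch is added to every image and repeated across the whole image, each local region sees the same texture regardless of where an object sits. The crop after tiling handles image sizes that are not a multiple of the patch size.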