Most existing physical adversarial attacks in the real world produce conspicuous, eye-catching patches that humans can easily identify. To overcome this limitation, recent work has proposed generating naturalistic patches with generative adversarial networks (GANs), so that the patches are less likely to attract human attention. However, these approaches are computationally intensive and do not always converge to natural-looking patterns. In this paper, we propose a novel lightweight framework that systematically generates naturalistic adversarial patches without using GANs. To illustrate the proposed approach, we generate adversarial art (AdvART): patches that look like artistic paintings while maintaining high attack efficiency. To this end, we redefine the optimization problem by introducing a new similarity objective. Specifically, we leverage similarity metrics to construct a similarity loss that is added to the optimized objective function. This component guides the patch to follow predefined artistic patterns while maximizing the victim model's loss function. Our patch achieves a high attack success rate, leaving YOLOv4tiny with only $12.53\%$ mean average precision (mAP) on the INRIA dataset.
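The combined objective described in the abstract (a similarity loss added to the objective while the victim model's loss is maximized) can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration: the patch is a short list of pixel intensities in [0, 1], mean squared error stands in for the unspecified similarity metric, a linear function stands in for the victim detector's loss, and the names `similarity_loss`, `detector_loss`, `total_objective`, `optimize_patch`, and the weight `lam` are hypothetical. It is not the AdvART implementation.

```python
# Toy sketch only: the linear "detector" and the MSE similarity metric are
# stand-in assumptions, not the method or models used in the paper.

def similarity_loss(patch, reference):
    """Mean squared error between the patch and a reference artwork."""
    return sum((p - r) ** 2 for p, r in zip(patch, reference)) / len(patch)


def detector_loss(patch, weights):
    """Stand-in for the victim model's loss (higher = worse detection).
    A real attack would evaluate an object detector here instead."""
    return sum(w * p for w, p in zip(weights, patch))


def total_objective(patch, reference, weights, lam):
    """Quantity to minimize: similarity penalty minus the victim's loss."""
    return lam * similarity_loss(patch, reference) - detector_loss(patch, weights)


def optimize_patch(reference, weights, lam=5.0, lr=0.05, steps=200):
    """Plain gradient descent on total_objective, starting from the artwork."""
    patch = list(reference)
    n = len(patch)
    for _ in range(steps):
        for i in range(n):
            # d/dp_i [lam * MSE] = lam * 2 * (p_i - r_i) / n
            # d/dp_i [-detector_loss] = -w_i
            grad = lam * 2.0 * (patch[i] - reference[i]) / n - weights[i]
            # Clamp so the patch stays a valid image intensity.
            patch[i] = min(1.0, max(0.0, patch[i] - lr * grad))
    return patch


reference_art = [0.2, 0.5, 0.8, 0.4]   # hypothetical artwork pixels
toy_weights = [0.3, -0.2, 0.1, 0.0]    # hypothetical detector sensitivity
adv_patch = optimize_patch(reference_art, toy_weights)
```

In this sketch, a larger `lam` keeps the patch closer to the reference artwork at the cost of a smaller increase in the stand-in detector loss; a smaller `lam` trades visual fidelity for attack strength.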