Adversarial patches are an important form of adversarial attack in the physical world. To improve the naturalness and aggressiveness of existing adversarial patches, location-aware patches have been proposed, in which the patch's location on the target object is integrated into the optimization process. Although this approach is effective, efficiently finding the optimal location for the patch is challenging, especially in black-box attack settings. In this paper, we propose the Distribution-Optimized Adversarial Patch (DOPatch), a novel method that optimizes a multimodal distribution of adversarial locations instead of individual locations. DOPatch has two main benefits. First, we find that the distributions of adversarial locations are highly similar across different models, so we can perform efficient query-based attacks on unseen models using a distributional prior optimized on a surrogate model. Second, by characterizing the distribution of adversarial locations, DOPatch can generate diverse adversarial samples, which we use to improve the model's robustness to location-aware patches through a carefully designed Distributional-Modeling Adversarial Training (DOP-DMAT). We evaluate DOPatch on various face recognition and image recognition tasks and demonstrate its superiority and efficiency over existing methods. We also conduct extensive ablation studies and analyses to validate the effectiveness of our method and provide insights into the distribution of adversarial locations.
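To make the idea of a query-based attack driven by a distributional prior more concrete, the following is a minimal sketch and not the DOPatch algorithm itself. It assumes the prior is a diagonal Gaussian mixture over normalized patch locations, that the black-box model is exposed only through a hypothetical `score_fn` returning the confidence in the true label (lower means more adversarial), and that the patch is pasted without any transformation. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def sample_locations(weights, means, stds, n, rng):
    """Draw n (x, y) locations in [0, 1]^2 from a diagonal Gaussian mixture."""
    comps = rng.choice(len(weights), size=n, p=weights)  # pick a mode per sample
    locs = means[comps] + stds[comps] * rng.standard_normal((n, 2))
    return np.clip(locs, 0.0, 1.0)

def apply_patch(image, patch, loc):
    """Paste patch onto a copy of image; loc is the normalized top-left corner."""
    h, w = image.shape[:2]
    ph, pw = patch.shape[:2]
    top = int(round(loc[1] * (h - ph)))
    left = int(round(loc[0] * (w - pw)))
    out = image.copy()
    out[top:top + ph, left:left + pw] = patch
    return out

def query_attack(image, patch, score_fn, weights, means, stds,
                 n_queries=50, seed=0):
    """Query the black box at locations sampled from the prior; keep the best."""
    rng = np.random.default_rng(seed)
    locs = sample_locations(weights, means, stds, n_queries, rng)
    scores = np.array([score_fn(apply_patch(image, patch, loc)) for loc in locs])
    best = int(np.argmin(scores))  # lowest true-label confidence
    return locs[best], float(scores[best])

# Toy stand-in for a black-box model (an assumption for illustration only):
# its "confidence" drops as the patch covers the region image[8:16, 8:16].
def toy_score(img):
    return 1.0 - float(img[8:16, 8:16].mean())

image = np.zeros((32, 32))
patch = np.ones((8, 8))
weights = np.array([0.5, 0.5])                # two modes, equally likely
means = np.array([[1 / 3, 1 / 3], [0.8, 0.8]])
stds = np.full((2, 2), 0.05)

best_loc, best_score = query_attack(image, patch, toy_score,
                                    weights, means, stds)
```

In this sketch the prior is fixed by hand. In the setting the abstract describes, it would instead be optimized on a surrogate model and then reused to keep the number of queries to the unseen model small.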