Generating Transferable and Stealthy Adversarial Patch via Attention-guided Adversarial Inpainting

Adversarial patch attacks can fool face recognition (FR) models with small patches. However, previous adversarial patch attacks often produce unnatural patterns that are easily noticeable. Generating adversarial patches that both transfer to black-box FR models and remain well camouflaged is challenging because of the large stylistic difference between the source and target images. To generate transferable, natural-looking, and stealthy adversarial patches, we propose a two-stage attack called Adv-Inpainting, which extracts style features from the attacker's face and identity features from the target face, and then fills the patch with misleading yet inconspicuous content guided by attention maps. In the first stage, we extract multi-scale style embeddings with a pyramid-like network and identity embeddings with a pretrained FR model, and we propose a novel Attention-guided Adaptive Instance Normalization layer (AAIN) to merge them via background-patch cross-attention maps. The proposed layer adaptively fuses identity and style embeddings by fully exploiting priority contextual information. In the second stage, we design an Adversarial Patch Refinement Network (APR-Net) trained with a novel boundary variance loss, a spatial discounted reconstruction loss, and a perceptual loss to further improve stealthiness. Experiments demonstrate that our attack generates adversarial patches with improved visual quality, better stealthiness, and stronger transferability than state-of-the-art adversarial patch attacks and semantic attacks.
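To make the fusion idea in the first stage more concrete, the following is a minimal sketch of an attention-weighted adaptive instance normalization step. It is an illustration under stated assumptions, not the AAIN layer itself: the abstract does not give the layer's equations, so the function names (`instance_norm`, `attention_guided_adain`), the per-channel `(gamma, beta)` parameters, and the rule of linearly blending identity and style modulation by an attention weight are all hypothetical. Only the instance-normalize-then-rescale pattern is standard adaptive instance normalization.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one feature channel (a flat list of activations)
    to roughly zero mean and unit standard deviation."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    std = math.sqrt(var + eps)
    return [(v - mean) / std for v in channel]


def attention_guided_adain(features, style_params, id_params, attention):
    """Hypothetical attention-weighted fusion (an assumption, not the paper's AAIN).

    features:     list of C channels, each a flat list of N activations
    style_params: list of C (gamma, beta) pairs, assumed to come from style embeddings
    id_params:    list of C (gamma, beta) pairs, assumed to come from identity embeddings
    attention:    flat list of N weights in [0, 1]; 1 selects identity modulation,
                  0 selects style modulation at that spatial position
    """
    fused = []
    for channel, (g_s, b_s), (g_i, b_i) in zip(features, style_params, id_params):
        normed = instance_norm(channel)
        fused.append([
            a * (g_i * x + b_i) + (1.0 - a) * (g_s * x + b_s)
            for x, a in zip(normed, attention)
        ])
    return fused


# Usage: one channel with four spatial positions; the last two positions
# are treated as "patch" locations that receive identity modulation.
features = [[1.0, 2.0, 3.0, 4.0]]
style_params = [(2.0, 1.0)]
id_params = [(0.5, -1.0)]
attention = [0.0, 0.0, 1.0, 1.0]
fused = attention_guided_adain(features, style_params, id_params, attention)
```

In this sketch the attention map acts as a soft spatial switch: positions with low attention keep the style statistics, which is what would favor blending with the surrounding face, while positions with high attention carry the identity signal. How the actual layer derives and applies its background-patch cross-attention maps is not specified in the abstract.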
