Rethinking Impersonation and Dodging Attacks on Face Recognition Systems

Face Recognition (FR) systems can be easily deceived by adversarial examples that manipulate benign face images through imperceptible perturbations. Adversarial attacks on FR fall into two types: impersonation (targeted) attacks and dodging (untargeted) attacks. Previous methods often achieve a successful impersonation attack on FR; however, this does not guarantee a successful dodging attack on FR in the black-box setting. In this paper, our key insight is that the generation of adversarial examples should perform both impersonation and dodging attacks simultaneously. To this end, we propose a novel attack method termed Adversarial Pruning (Adv-Pruning), which fine-tunes existing adversarial examples to enhance their dodging capabilities while preserving their impersonation capabilities. Adv-Pruning consists of Priming, Pruning, and Restoration stages. Concretely, we propose Adversarial Priority Quantification to measure the region-wise priority of the original adversarial perturbations, identifying and releasing those with minimal impact on absolute model output variances. Then, Biased Gradient Adaptation adapts the adversarial examples to traverse the decision boundaries of both the attacker and the victim by adding perturbations that favor dodging attacks to the vacated regions, preserving the prioritized features of the original perturbations while boosting dodging performance. As a result, we maintain the impersonation capabilities of the original adversarial examples while effectively enhancing their dodging capabilities. Comprehensive experiments demonstrate the superiority of our method compared with state-of-the-art adversarial attacks.
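To make the prune-then-refill idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear encoder (`embed` with a random matrix `W`), a leave-one-patch-out ablation score as a stand-in for Adversarial Priority Quantification, and plain sign-gradient steps as a stand-in for Biased Gradient Adaptation. All function names and the parameters `patch`, `ratio`, `eps`, `alpha`, and `steps` are hypothetical.

```python
import numpy as np

def embed(x, W):
    """Toy linear 'face encoder' (assumption): flatten the image and project it."""
    return W @ x.ravel()

def region_priority(x, delta, W, patch):
    """Score each patch by the absolute output change when its perturbation is removed."""
    rows, cols = delta.shape[0] // patch, delta.shape[1] // patch
    base = embed(x + delta, W)
    scores = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            trial = delta.copy()
            trial[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = 0.0
            scores[i, j] = np.abs(embed(x + trial, W) - base).sum()
    return scores

def prune_mask(scores, patch, ratio):
    """Pixel mask that is 1 on the lowest-priority patches (the vacated regions)."""
    k = int(round(ratio * scores.size))
    mask = np.zeros_like(scores)
    mask.flat[np.argsort(scores.ravel())[:k]] = 1.0
    return np.kron(mask, np.ones((patch, patch)))

def dodging_grad(x_adv, src_emb, W):
    """Gradient of cosine(embed(x_adv), src_emb) with respect to x_adv (linear encoder)."""
    e = embed(x_adv, W)
    ne, ns = np.linalg.norm(e), np.linalg.norm(src_emb)
    cos = e @ src_emb / (ne * ns)
    grad_e = src_emb / (ne * ns) - cos * e / ne**2
    return (W.T @ grad_e).reshape(x_adv.shape)

def refine(x, delta, src_emb, W, patch=4, ratio=0.25, eps=0.1, alpha=0.02, steps=10):
    """Release low-priority regions, then refill them with dodging-oriented steps."""
    mask = prune_mask(region_priority(x, delta, W, patch), patch, ratio)
    new = delta * (1.0 - mask)  # pruning: zero the perturbation in vacated regions
    for _ in range(steps):
        g = dodging_grad(x + new, src_emb, W)
        step = np.clip(new - alpha * np.sign(g), -eps, eps)  # move away from the source identity
        new = np.where(mask > 0, step, new)  # retained regions stay untouched
    return new, mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, x_src = rng.random((8, 8)), rng.random((8, 8))
    W = rng.standard_normal((16, 64))
    delta = rng.uniform(-0.1, 0.1, (8, 8))  # stands in for an existing impersonation perturbation
    new_delta, mask = refine(x, delta, embed(x_src, W), W)
    print("vacated pixels:", int(mask.sum()))
```

Restricting updates to the masked pixels is the design choice that matters in this sketch: the retained, high-priority part of the original perturbation is left exactly as it was, so whatever impersonation effect it carried is not directly overwritten by the dodging steps.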
