Recently proposed facial cloaking attacks add invisible perturbations (cloaks) to facial images to protect users from being recognized by unauthorized facial recognition models. However, we show that these cloaks are not robust enough and can be removed from images. This paper introduces PuFace, an image purification system that leverages the generalization ability of neural networks to diminish the impact of cloaks by pushing cloaked images toward the manifold of natural (uncloaked) images before facial recognition models are trained. Specifically, we devise a purifier that takes all training images, both cloaked and natural, as input and generates purified facial images close to the manifold on which natural images lie. To meet the defense goal, we train the purifier on specially amplified cloaked images with a loss function that combines an image loss and a feature loss. Our experiments show that PuFace effectively defends against two state-of-the-art facial cloaking attacks, reducing the attack success rate from 69.84\% to 7.61\% on average without degrading normal accuracy across various facial recognition models. Moreover, PuFace is a model-agnostic defense mechanism that can be applied to any facial recognition model without modifying the model structure.
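As a rough illustration of the training objective described above, the following sketch shows one way cloak amplification and a combined image-plus-feature loss could be written. The abstract does not define any of these terms, so everything here is an assumption: the function names (`amplify_cloak`, `combined_loss`), the amplification factor `alpha`, the weight `feature_weight`, the use of mean absolute error for both terms, and the representation of images as flat lists of pixel values in [0, 1] are all hypothetical and are not taken from PuFace itself.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def amplify_cloak(natural: Vector, cloaked: Vector, alpha: float) -> List[float]:
    """Scale the cloak (cloaked - natural) by alpha and clip pixels to [0, 1].

    Assumption: amplification is a linear scaling of the perturbation.
    """
    return [min(1.0, max(0.0, n + alpha * (c - n))) for n, c in zip(natural, cloaked)]


def mean_abs_error(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def combined_loss(
    purified: Vector,
    natural: Vector,
    extract_features: Callable[[Vector], Vector],
    feature_weight: float,
) -> float:
    """Image loss plus weighted feature loss (both hypothetical choices).

    image loss:   distance between purified and natural pixels
    feature loss: distance between their feature embeddings
    """
    image_loss = mean_abs_error(purified, natural)
    feature_loss = mean_abs_error(extract_features(purified), extract_features(natural))
    return image_loss + feature_weight * feature_loss
```

In a real system `extract_features` would be a pretrained face feature extractor and `purified` would be the purifier network's output on the amplified image; here any callable mapping a vector to a vector can stand in for the extractor.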