This work addresses the problem of anonymizing the identity of faces in a dataset of images, such that the privacy of those depicted is not violated, while the dataset remains useful for downstream tasks such as training machine learning models. To the best of our knowledge, we are the first to explicitly address this issue and to deal with two major drawbacks of the existing state-of-the-art approaches, namely that they (i) require the costly training of additional, purpose-trained neural networks, and/or (ii) fail to retain the facial attributes of the original images in their anonymized counterparts, the preservation of which is of paramount importance for their use in downstream tasks. We accordingly present a task-agnostic anonymization procedure that directly optimizes the images' latent representations in the latent space of a pre-trained GAN. By optimizing the latent codes directly, we ensure both that the identity is a desired distance away from the original (with an identity obfuscation loss) and that the facial attributes are preserved (using a novel feature-matching loss in FaRL's deep feature space). We demonstrate through a series of qualitative and quantitative experiments that our method is capable of anonymizing the identity of the images whilst, crucially, better preserving the facial attributes. We make the code and the pre-trained models publicly available at: https://github.com/chi0tzp/FALCO.
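The latent-code optimization described above can be illustrated with a toy sketch. It is not the released FALCO implementation. The pre-trained GAN, the identity encoder, and the FaRL feature extractor are replaced by hypothetical random linear maps (`GENERATOR`, `ID_ENCODER`, `ATTR_ENCODER`), and gradients come from finite differences rather than backpropagation. The exact loss forms, the weights `lam_id` and `lam_attr`, the target similarity `target_cos`, and the noisy initialization are all assumptions made for illustration; the abstract only states that an identity obfuscation loss and a feature-matching loss are combined.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy run is repeatable

def random_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical frozen linear stand-ins (NOT the real networks).
GENERATOR = random_matrix(16, 8)     # latent code -> "image"
ID_ENCODER = random_matrix(6, 16)    # "image" -> identity embedding
ATTR_ENCODER = random_matrix(6, 16)  # "image" -> attribute features

def losses(w, id_ref, attr_ref, target_cos, lam_id, lam_attr):
    """Return (total, identity term, attribute term) for latent code w."""
    image = matvec(GENERATOR, w)
    # Assumed identity term: push cosine similarity toward a chosen target.
    id_loss = abs(cosine(matvec(ID_ENCODER, image), id_ref) - target_cos)
    # Assumed attribute term: mean squared error between feature vectors.
    attr = matvec(ATTR_ENCODER, image)
    attr_loss = sum((a - b) ** 2 for a, b in zip(attr, attr_ref)) / len(attr)
    return lam_id * id_loss + lam_attr * attr_loss, id_loss, attr_loss

def anonymize(w_orig, target_cos=0.0, lam_id=1.0, lam_attr=0.1,
              steps=200, lr=0.05, eps=1e-5):
    image = matvec(GENERATOR, w_orig)
    id_ref = matvec(ID_ENCODER, image)
    attr_ref = matvec(ATTR_ENCODER, image)

    def objective(w):
        return losses(w, id_ref, attr_ref, target_cos, lam_id, lam_attr)[0]

    # Start near the original code. Starting exactly on it would stall,
    # because both loss terms have zero gradient at that point.
    w = [x + rng.gauss(0.0, 0.1) for x in w_orig]
    start = current = objective(w)
    for _ in range(steps):
        # Central finite differences stand in for autograd.
        grad = []
        for i in range(len(w)):
            plus = w[:i] + [w[i] + eps] + w[i + 1:]
            minus = w[:i] + [w[i] - eps] + w[i + 1:]
            grad.append((objective(plus) - objective(minus)) / (2 * eps))
        candidate = [x - lr * g for x, g in zip(w, grad)]
        candidate_loss = objective(candidate)
        if candidate_loss < current:
            w, current = candidate, candidate_loss  # accept improving steps only
        else:
            lr *= 0.5                               # otherwise shrink the step
    return w, start, current

w_orig = [rng.gauss(0.0, 1.0) for _ in range(8)]
w_anon, loss_start, loss_end = anonymize(w_orig)
print(f"objective: {loss_start:.4f} -> {loss_end:.4f}")
```

Only the latent code is updated while all three maps stay frozen, which mirrors the claim that no additional network has to be trained. In a real setting the stand-ins would be replaced by the pre-trained generator and encoders, and an automatic-differentiation framework would supply the gradients.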