Face recognition models embed a face image into a low-dimensional identity vector containing abstract encodings of identity-specific facial features that allow individuals to be distinguished from one another. We tackle the challenging task of inverting the latent space of pre-trained face recognition models without full model access (i.e., in the black-box setting). A variety of methods have been proposed in the literature for this task, but they have serious shortcomings such as unrealistic outputs, long inference times, and strict requirements on the data set and on access to the face recognition model. Through an analysis of the black-box inversion problem, we show that the conditional diffusion model loss naturally emerges and that we can effectively sample from the inverse distribution even without an identity-specific loss. Our method, named identity denoising diffusion probabilistic model (ID3PM), leverages the stochastic nature of the denoising diffusion process to produce high-quality, identity-preserving face images with various backgrounds, lighting, poses, and expressions. We demonstrate state-of-the-art performance in terms of identity preservation and diversity, both qualitatively and quantitatively. Our method is the first black-box face recognition model inversion method that offers intuitive control over the generation process and does not suffer from any of the common shortcomings of competing methods.
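To make the phrase "conditional diffusion model loss" concrete, the following is a minimal sketch, assuming the standard DDPM noise-prediction objective with the identity vector passed to the denoiser as conditioning. The abstract does not give the exact formulation, so the linear noise schedule, the function names, and the `zero_denoiser` stand-in are all hypothetical illustrations rather than the ID3PM implementation.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_noise(x0, eps, alpha_bar_t):
    """Closed-form forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * e for x, e in zip(x0, eps)]


def conditional_denoising_loss(denoiser, x0, identity_vec, alpha_bar, rng):
    """One Monte Carlo sample of the noise-prediction loss.

    The denoiser sees the noisy image, the timestep, and the identity vector;
    no separate identity-specific loss term is involved.
    """
    t = rng.randrange(len(alpha_bar))            # random diffusion step
    eps = [rng.gauss(0.0, 1.0) for _ in x0]      # Gaussian noise target
    x_t = forward_noise(x0, eps, alpha_bar[t])
    eps_hat = denoiser(x_t, t, identity_vec)
    return sum((e - h) ** 2 for e, h in zip(eps, eps_hat)) / len(x0)


def zero_denoiser(x_t, t, identity_vec):
    """Hypothetical stand-in for a trained network: always predicts zero noise."""
    return [0.0] * len(x_t)


rng = random.Random(0)
alpha_bar = linear_alpha_bar(1000)
x0 = [0.5, -0.25, 0.1, 0.0]      # toy "image" with four pixels
identity_vec = [0.6, 0.8]        # toy identity vector from a black-box model
loss = conditional_denoising_loss(zero_denoiser, x0, identity_vec, alpha_bar, rng)
```

In this sketch the face recognition model is only ever used to produce `identity_vec`; its weights and gradients never appear, which is what the black-box setting requires. In practice, `zero_denoiser` would be replaced by a trained neural network that actually uses the conditioning.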