High Resolution Face Editing with Masked GAN Latent Code Optimization

Source: arXiv (final arXiv version). The paper has been accepted for publication in IEEE Transactions on Image Processing and will be published in 2023.

Face editing is a popular research topic within the computer vision and image processing communities. While significant progress has been made recently in this area, existing solutions: (i) are still largely focused on low-resolution images, (ii) often generate editing results with visual artefacts, or (iii) lack fine-grained control and alter multiple (entangled) attributes at once when trying to generate the desired facial semantics. In this paper, we aim to address these issues through a novel attribute editing approach called MaskFaceGAN that focuses on local attribute editing. The proposed approach is based on an optimization procedure that directly optimizes the latent code of a pre-trained (state-of-the-art) Generative Adversarial Network (i.e., StyleGAN2) with respect to several constraints that ensure: (i) preservation of relevant image content, (ii) generation of the targeted facial attributes, and (iii) spatially-selective treatment of local image areas. The constraints are enforced with the help of a (differentiable) attribute classifier and face parser that provide the necessary reference information for the optimization procedure. MaskFaceGAN is evaluated in extensive experiments on the CelebA-HQ, Helen and SiblingsDB-HQf datasets and in comparison with several state-of-the-art techniques from the literature, i.e., StarGAN, AttGAN, STGAN, and two versions of InterFaceGAN. Our experimental results show that the proposed approach is able to edit face images with respect to several local facial attributes with unprecedented image quality and at high resolution (1024x1024), while exhibiting considerably fewer problems with attribute entanglement than competing solutions. The source code is freely available from: https://github.com/MartinPernus/MaskFaceGAN.
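The idea of optimizing a latent code under masked constraints can be illustrated with a deliberately small sketch. This is not the MaskFaceGAN implementation: the pre-trained StyleGAN2 generator is replaced by a hypothetical linear map `A`, the attribute classifier by the mean intensity of the masked pixels, and the face parser by a fixed binary mask. All function names, the weight `lam`, and the learning rate are assumptions made for illustration only.

```python
import random

def generate(A, w):
    """Toy linear 'generator' (stand-in for StyleGAN2): flat image = A @ w."""
    return [sum(a * wi for a, wi in zip(row, w)) for row in A]

def attribute_score(y, mask):
    """Toy 'attribute classifier': mean intensity of the masked pixels."""
    return sum(m * yi for m, yi in zip(mask, y)) / sum(mask)

def loss_and_grad(A, w, x, mask, target, lam):
    """Masked loss and its gradient with respect to the latent code w.

    Outside the mask: squared error to the original image x (content preservation).
    Inside the mask: squared error between the attribute score and the target.
    """
    y = generate(A, w)
    n_in = sum(mask)
    score = attribute_score(y, mask)
    keep = sum((1 - m) * (yi - xi) ** 2 for m, yi, xi in zip(mask, y, x))
    attr = (score - target) ** 2
    # Per-pixel derivative of the total loss with respect to the generated image.
    dy = [2 * (1 - m) * (yi - xi) + lam * 2 * (score - target) * m / n_in
          for m, yi, xi in zip(mask, y, x)]
    # Chain rule through the linear generator: dL/dw = A^T dL/dy.
    grad = [sum(A[i][j] * dy[i] for i in range(len(A))) for j in range(len(w))]
    return keep + lam * attr, grad

def optimize_latent(A, w0, x, mask, target, lam=1.0, lr=0.02, steps=500):
    """Plain gradient descent on the latent code; the generator A stays fixed."""
    w = list(w0)
    for _ in range(steps):
        _, grad = loss_and_grad(A, w, x, mask, target, lam)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

random.seed(0)
n_pixels, n_latent = 8, 4
A = [[random.uniform(-1, 1) for _ in range(n_latent)] for _ in range(n_pixels)]
w0 = [random.uniform(-1, 1) for _ in range(n_latent)]
x = generate(A, w0)                  # "original" image, reconstructed exactly by w0
mask = [1, 1, 0, 0, 0, 0, 0, 0]      # hypothetical region to edit (first two pixels)

score_before = attribute_score(x, mask)
target = score_before + 1.0          # ask for a stronger attribute in the masked region

loss_before, _ = loss_and_grad(A, w0, x, mask, target, lam=1.0)
w_star = optimize_latent(A, w0, x, mask, target)
loss_after, _ = loss_and_grad(A, w_star, x, mask, target, lam=1.0)
score_after = attribute_score(generate(A, w_star), mask)
```

In this toy setting the two loss terms compete: moving the masked pixels toward the target score also perturbs pixels outside the mask, and the weight `lam` sets the trade-off. The abstract describes the same kind of trade-off between content preservation and attribute generation, but with learned, differentiable models in place of the hand-written stand-ins used here.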
