Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space and that numerous edits can be applied to those images, thanks to the semantically rich feature representations of well-trained GAN models. However, extensive research has also shown that image inversion is challenging because of the trade-off between high-fidelity reconstruction and editability. In this paper, we tackle an even more difficult task: inverting erased images into a GAN's latent space for realistic inpainting and editing. Furthermore, by augmenting the inverted latent codes with different latent samples, we achieve diverse inpaintings. Specifically, we propose to learn an encoder and a mixing network to combine encoded features from erased images with StyleGAN's mapped features from random samples. To encourage the mixing network to utilize both inputs, we train the networks on generated data via a novel setup. We also utilize higher-rate features to prevent color inconsistencies between the inpainted and unerased parts. We run extensive experiments and compare our method with state-of-the-art inversion and inpainting methods. Quantitative metrics and visual comparisons show significant improvements over these methods.
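The idea of combining encoded features with mapped features from random samples can be illustrated with a toy sketch. It is not the mixing network described above, which is learned; here a fixed per-layer convex combination stands in for it, purely as an assumption. The names `sample_mapped_codes`, `mix_codes`, and `gates`, the toy sizes, and the gate values are all hypothetical, and random Gaussian vectors replace real encoder and StyleGAN outputs.

```python
import random

NUM_LAYERS = 4   # toy value (assumption), not StyleGAN's real layer count
LATENT_DIM = 8   # toy value (assumption), not StyleGAN's real latent size


def sample_mapped_codes(rng):
    """Hypothetical stand-in for StyleGAN's mapped features of a random sample."""
    return [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
            for _ in range(NUM_LAYERS)]


def mix_codes(encoded, sampled, gates):
    """Blend per layer: gate * encoded + (1 - gate) * sampled.

    A fixed convex combination used only for illustration; the abstract
    describes a learned mixing network instead.
    """
    return [[g * e + (1.0 - g) * s for e, s in zip(enc_row, smp_row)]
            for enc_row, smp_row, g in zip(encoded, sampled, gates)]


# Pretend these codes came from an encoder applied to an erased image.
encoded = sample_mapped_codes(random.Random(0))

# Hypothetical gates: the first layer copies the encoder, the last copies the sample.
gates = [1.0, 0.75, 0.5, 0.0]

# Different random samples give different mixed codes for the same erased image.
mixed_a = mix_codes(encoded, sample_mapped_codes(random.Random(1)), gates)
mixed_b = mix_codes(encoded, sample_mapped_codes(random.Random(2)), gates)
```

In this sketch, layers with a gate of 1.0 are identical across `mixed_a` and `mixed_b`, while layers with smaller gates differ, which mirrors how varying the random sample could yield diverse results for one erased input.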