Recent advances in real image editing have been attributed to exploration of the latent space of Generative Adversarial Networks (GANs). The main challenge of this procedure, however, is GAN inversion, which aims to map an image accurately into the latent space. Existing methods that operate in the extended latent space $W+$ cannot achieve low distortion and high editability simultaneously. To address this issue, we propose an approach that works in the native latent space $W$ and tunes the generator network to restore missing image details. We introduce a novel regularization strategy with learnable coefficients obtained by training a randomized StyleGAN 2 model, WRanGAN. This method outperforms traditional approaches in reconstruction quality and computational efficiency, achieving the lowest distortion with 4 times fewer parameters. Furthermore, we observe a slight improvement in the quality of the constructed hyperplanes corresponding to binary image attributes. We demonstrate the effectiveness of our approach on two complex datasets: Flickr-Faces-HQ and LSUN Church.