Recent advances in generative models, in particular generative adversarial networks (GANs), have led to substantial progress in controlled image editing, especially compared with the pre-deep-learning era. Despite their powerful ability to apply realistic modifications to an image, these methods often lack properties such as disentanglement (the capacity to edit attributes independently). In this paper, we propose an autoencoder that reorganizes the latent space of StyleGAN so that each attribute we wish to edit corresponds to an axis of the new latent space, and so that the latent axes are decorrelated, which encourages disentanglement. We work in a compressed version of the latent space obtained with Principal Component Analysis, which reduces the parameter complexity of our autoencoder and leads to short training times ($\sim$45 minutes). Qualitative and quantitative results demonstrate the editing capabilities of our approach, with greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity. Our autoencoder architecture is simple and straightforward, facilitating implementation.
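To make the two ingredients named in the abstract concrete, the following is a minimal sketch of PCA compression of latent codes and of one possible decorrelation measure. It is not the authors' implementation: the random array stands in for StyleGAN latent codes, the dimensions (512 and 64) and all function names are assumptions chosen for illustration, and the penalty shown (mean squared off-diagonal correlation) is only one plausible form such an objective could take.

```python
import numpy as np


def fit_pca(w, k):
    """Return the mean and top-k principal directions of latent codes w, shape (n, d)."""
    mean = w.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(w - mean, full_matrices=False)
    return mean, vt[:k]


def compress(w, mean, components):
    """Project latent codes onto the retained principal directions."""
    return (w - mean) @ components.T


def decompress(z, mean, components):
    """Map compressed codes back to the original latent dimension."""
    return z @ components + mean


def decorrelation_penalty(z):
    """Mean squared off-diagonal entry of the correlation matrix of z, shape (n, k)."""
    corr = np.corrcoef(z, rowvar=False)
    off_diag = corr - np.diag(np.diag(corr))
    k = corr.shape[0]
    return float((off_diag ** 2).sum() / (k * (k - 1)))


# Hypothetical stand-in for a batch of StyleGAN latent codes (assumed 512-dimensional).
rng = np.random.default_rng(0)
w = rng.normal(size=(1000, 512))

mean, comps = fit_pca(w, k=64)          # assumed compressed dimension
z = compress(w, mean, comps)            # shape (1000, 64)
w_rec = decompress(z, mean, comps)      # shape (1000, 512)
penalty = decorrelation_penalty(z)      # near zero: PCA coordinates are uncorrelated
```

In a setup like the one the abstract describes, an autoencoder would be trained on the compressed codes `z`, and a penalty of this kind would be applied to the autoencoder's own latent axes rather than to the PCA coordinates, which are uncorrelated by construction.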