Recent advances in the understanding of Generative Adversarial Networks (GANs) have led to remarkable progress in visual editing and synthesis tasks, capitalizing on the rich semantics that are embedded in the latent spaces of pre-trained GANs. However, existing methods are often tailored to specific GAN architectures and either are limited to discovering global semantic directions that do not facilitate localized control or require some form of supervision through manually provided regions or segmentation masks. In this light, we present an architecture-agnostic approach that jointly discovers factors representing spatial parts and their appearances in an entirely unsupervised fashion. These factors are obtained by applying a semi-nonnegative tensor factorization on the feature maps, which in turn enables context-aware local image editing with pixel-level control. In addition, we show that the discovered appearance factors correspond to saliency maps that localize concepts of interest, without using any labels. Experiments on a wide range of GAN architectures and datasets show that, in comparison to the state of the art, our method is far more efficient in terms of training time and, most importantly, provides much more accurate localized control. Our code is available at: https://github.com/james-oldfield/PandA.
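To make the idea of a semi-nonnegative factorization of feature maps more concrete, the following is a minimal sketch and not the implementation from the linked repository. It assumes a simplified matrix form of the problem: a single feature map of shape (C, H, W) is flattened to a matrix Z of shape (C, H*W) and approximated as Z ≈ A Pᵀ, where the spatial factors P are constrained to be nonnegative and the appearance factors A are unconstrained. The function names, the alternating least-squares and projected-gradient scheme, and the random toy data are all assumptions chosen for illustration.

```python
import numpy as np


def semi_nonnegative_factorize(Z, rank, n_iters=200, seed=0):
    """Approximate Z (C x S) as A @ P.T with P >= 0 and A unconstrained.

    Illustrative scheme (an assumption, not the paper's algorithm):
    A is solved exactly by least squares, P takes one projected
    gradient step per iteration.
    """
    rng = np.random.default_rng(seed)
    _, S = Z.shape
    P = np.abs(rng.standard_normal((S, rank)))  # nonnegative initialisation
    for _ in range(n_iters):
        # Exact least-squares update of the appearance factors for fixed P.
        A = Z @ P @ np.linalg.pinv(P.T @ P)
        # Gradient of 0.5 * ||Z - A P^T||_F^2 with respect to P.
        G = A.T @ A
        grad = P @ G - Z.T @ A
        # Step size 1/L, where L is the spectral norm of A^T A.
        step = 1.0 / (np.linalg.norm(G, 2) + 1e-12)
        # Projection onto the nonnegative orthant keeps P >= 0.
        P = np.maximum(P - step * grad, 0.0)
    # Final refit of A to the last P.
    A = Z @ P @ np.linalg.pinv(P.T @ P)
    return A, P


def relative_error(Z, A, P):
    """Frobenius-norm reconstruction error relative to the norm of Z."""
    return np.linalg.norm(Z - A @ P.T) / np.linalg.norm(Z)


# Toy stand-in for one GAN feature map (random data, not real activations).
rng = np.random.default_rng(1)
C, H, W, R = 16, 8, 8, 3
feature_maps = rng.standard_normal((C, R)) @ np.abs(rng.standard_normal((R, H * W)))

A, P = semi_nonnegative_factorize(feature_maps, rank=R)
part_maps = P.T.reshape(R, H, W)  # one nonnegative spatial map per factor
print("relative reconstruction error:", relative_error(feature_maps, A, P))
```

In this sketch each slice `part_maps[r]` is a nonnegative map over the spatial grid and can be read as a soft mask for one part, while the corresponding column `A[:, r]` describes how that part is expressed across channels. The nonnegativity constraint is what makes the spatial factors interpretable as additive regions.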