Training generative adversarial networks (GANs) stably is a challenging task. The generator in a GAN transforms noise vectors, typically Gaussian distributed, into realistic data such as images. In this paper, we propose a novel approach for training GANs with images as inputs, but without enforcing any pairwise constraints. The intuition is that images are more structured than noise, and the generator can leverage this structure to learn a more robust transformation. The process can be made efficient by identifying closely related datasets, or a "friendly neighborhood" of the target distribution, which inspires the moniker Spider GAN. To define friendly neighborhoods by leveraging proximity between datasets, we propose a new measure called the signed inception distance (SID), inspired by the polyharmonic kernel. We show that the Spider GAN formulation results in faster convergence, as the generator can discover correspondence even between seemingly unrelated datasets, for instance, between Tiny-ImageNet and CelebA faces. Further, we demonstrate cascading Spider GAN, where the output distribution from a pre-trained GAN generator is used as the input to the subsequent network. Effectively, this transports one distribution to another in a cascaded fashion until the target is learnt, a new flavor of transfer learning. We demonstrate the efficacy of the Spider approach on DCGAN, conditional GAN, PGGAN, StyleGAN2 and StyleGAN3. The proposed approach achieves state-of-the-art Fréchet inception distance (FID) values with one-fifth of the training iterations required by the baseline counterparts on high-resolution small datasets such as MetFaces, Ukiyo-E Faces and AFHQ-Cats.
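The two structural ideas in the abstract, feeding the generator unpaired data instead of noise and chaining generators in a cascade, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sample_noise`, `sample_unpaired`, `cascade`) and the toy stand-in generators are hypothetical, vectors stand in for images, and no training or SID computation is shown.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Generator = Callable[[Vector], Vector]


def sample_noise(batch_size: int, dim: int, rng: random.Random) -> List[Vector]:
    """Classical GAN input: i.i.d. standard Gaussian noise vectors."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]


def sample_unpaired(
    source: Sequence[Vector],
    target: Sequence[Vector],
    batch_size: int,
    rng: random.Random,
) -> Tuple[List[Vector], List[Vector]]:
    """Data-as-input variant: source and target batches are drawn
    independently, so no source item is tied to a particular target item."""
    src = [list(rng.choice(source)) for _ in range(batch_size)]
    tgt = [list(rng.choice(target)) for _ in range(batch_size)]
    return src, tgt


def cascade(stages: Sequence[Generator]) -> Generator:
    """Chain generators so each stage consumes the previous stage's output."""
    def run(x: Vector) -> Vector:
        for stage in stages:
            x = stage(x)
        return x
    return run


# Toy stand-ins for trained generators (hypothetical, for illustration only).
def shift_by_one(x: Vector) -> Vector:
    return [v + 1.0 for v in x]


def double(x: Vector) -> Vector:
    return [2.0 * v for v in x]
```

In a real setup, `src` would be fed to the generator and `tgt` to the discriminator as real samples, and each stage of `cascade` would be a trained network rather than a fixed arithmetic map.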