Generative Adversarial Networks (GANs) are powerful models that can synthesize data samples closely resembling the distribution of real data, yet the diversity of the generated samples is limited by the so-called mode collapse phenomenon. Conditional GANs are especially prone to mode collapse, as they tend to ignore the input noise vector and focus on the conditional information. Recent methods proposed to mitigate this limitation increase the diversity of generated samples, but they reduce the performance of the models when similarity of samples is required. To address this shortcoming, we propose a novel method to selectively increase the diversity of GAN-generated samples. By adding a simple yet effective regularization term to the training loss function, we encourage the generator to discover new data modes for inputs related to diverse outputs while generating consistent samples for the remaining ones. More precisely, we maximise the ratio of the distances between generated images to the distances between the corresponding input latent vectors, scaling the effect according to the diversity of samples for a given conditional input. We show the superiority of our method on a synthetic benchmark as well as in a real-life scenario of simulating data from the Zero Degree Calorimeter of the ALICE experiment at the LHC, CERN.
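The regularization described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function name `selective_diversity_penalty`, the use of a mean absolute difference as the distance, and the per-condition weight `diversity_weight` (for example, a normalized spread of the real samples observed for each conditional input). For each conditional input, two images are generated from two different latent vectors, and the ratio of image distance to latent distance is scaled by the diversity weight.

```python
import numpy as np


def selective_diversity_penalty(images_a, images_b, z_a, z_b,
                                diversity_weight, eps=1e-8):
    """Hypothetical selective diversity term (a sketch, not the paper's code).

    images_a, images_b: arrays of shape (n, ...) generated from the same n
        conditional inputs but from different latent vectors z_a and z_b.
    z_a, z_b: latent vectors of shape (n, latent_dim).
    diversity_weight: shape (n,), assumed to be in [0, 1]; larger values mean
        the conditional input is associated with more diverse real samples.
    Returns a scalar to ADD to the generator loss; minimising it maximises
    the weighted ratio of image distance to latent distance.
    """
    n = images_a.shape[0]
    # Mean absolute difference between the two images of each pair.
    image_dist = np.abs(images_a - images_b).reshape(n, -1).mean(axis=1)
    # Mean absolute difference between the two latent vectors of each pair.
    latent_dist = np.abs(z_a - z_b).reshape(n, -1).mean(axis=1)
    # eps guards against division by zero when the latent vectors coincide.
    ratio = image_dist / (latent_dist + eps)
    # Conditions with weight 0 contribute nothing, so the generator is not
    # pushed away from producing consistent samples for them.
    return -np.mean(diversity_weight * ratio)
```

In this sketch a weight of zero leaves a conditional input unregularized, while a weight near one rewards the generator for mapping different latent vectors to visibly different images. In an actual training loop the same computation would be written in the autodiff framework used for the GAN so that gradients reach the generator.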