SCONE-GAN is an end-to-end image translation method that is shown to be effective for learning to generate realistic and diverse scenery images. Most current image-to-image translation approaches are devised as two mappings: a translation from the source to the target domain and another that represents its inverse. While successful in many applications, these approaches may produce trivial solutions with limited diversity, because they learn more frequent associations rather than the scene structure. To mitigate this problem, we propose SCONE-GAN, which utilises graph convolutional networks to learn object dependencies, maintain the image structure, and preserve its semantics while transferring images into the target domain. For more realistic and diverse image generation, we introduce a style reference image and train the model to maximize the mutual information between the style image and the output. The proposed method explicitly maximizes the mutual information between related patches, thus encouraging the generator to produce more diverse images. We validate the proposed algorithm on image-to-image translation and on stylizing outdoor images. Both qualitative and quantitative results demonstrate the effectiveness of our approach on four datasets.
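The abstract does not say which estimator is used to maximize mutual information between related patches. A common choice in contrastive image translation is the InfoNCE loss, whose minimization maximizes a lower bound on mutual information. The sketch below assumes that choice; the function name `info_nce`, the feature shapes, and the temperature value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def info_nce(query, keys, positive_idx, temperature=0.07):
    """InfoNCE loss for one query patch against a set of candidate patches.

    query:        (d,) feature vector of an output patch (assumed encoder output).
    keys:         (n, d) feature vectors of candidate patches.
    positive_idx: index in `keys` of the patch related to the query;
                  all other rows act as negatives.
    """
    # L2-normalise so that dot products are cosine similarities.
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # Similarity of the query to every candidate, scaled by the temperature.
    logits = (k @ q) / temperature

    # Numerically stable log-softmax over the candidates.
    logits = logits - logits.max()
    log_prob = logits - np.log(np.exp(logits).sum())

    # Cross-entropy with the related patch as the correct class.
    return -log_prob[positive_idx]
```

With `n` candidates, the mutual information between query and positive is bounded below by `log(n)` minus the expected loss, so driving the loss down pushes the bound up. In a full model this term would be averaged over many patch locations and added to the adversarial objective.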