Synthetic data offers enormous benefits for developing perception-based deep learning models. However, the performance of networks trained on synthetic data for certain computer vision tasks degrades significantly when tested on real-world data, owing to the domain gap between the two. A popular way to bridge this gap between synthetic and real-world data is to frame the problem as a domain adaptation task. In this paper, we propose and evaluate novel ways to improve such approaches; in particular, we build upon UNIT-GAN. In standard GAN training for domain translation, images from the two domains (viz., real and synthetic) are paired randomly. We propose a novel method that efficiently incorporates semantic supervision into this pair selection, which boosts model performance and improves the visual quality of the translated images. We report empirical findings on Cityscapes \cite{cityscapes} and the challenging synthetic dataset Synscapes. Although the findings are reported with UNIT-GAN as the base network, the method can be readily extended to other similar networks.
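The abstract does not spell out how semantic supervision enters pair selection, so the following is only an illustrative sketch of one plausible reading, not the authors' procedure. It assumes each image comes with a semantic label map (ground truth or predicted), summarizes each map as a class-frequency histogram, and pairs every synthetic image with a real image whose class proportions are similar, instead of pairing at random. The function names, the histogram-intersection similarity, and the `top_k` parameter are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def class_histogram(label_map, num_classes):
    """Normalized class-frequency histogram of a label map.

    `label_map` is a list of rows, each a list of integer class ids.
    """
    counts = Counter(c for row in label_map for c in row)
    total = max(sum(counts.values()), 1)  # guard against an empty map
    return [counts.get(k, 0) / total for k in range(num_classes)]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical class proportions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def select_pairs(synthetic_labels, real_labels, num_classes, top_k=1, seed=0):
    """Pair each synthetic image with a semantically similar real image.

    Assumption (not from the paper): similarity is measured by histogram
    intersection, and the partner is drawn uniformly from the top_k most
    similar real images to retain some randomness in the pairing.
    """
    rng = random.Random(seed)
    real_hists = [class_histogram(m, num_classes) for m in real_labels]
    pairs = []
    for i, syn in enumerate(synthetic_labels):
        h = class_histogram(syn, num_classes)
        ranked = sorted(
            range(len(real_hists)),
            key=lambda j: histogram_intersection(h, real_hists[j]),
            reverse=True,
        )
        pairs.append((i, rng.choice(ranked[:top_k])))
    return pairs


# Toy 2x2 label maps with three classes (0, 1, 2).
synthetic = [[[0, 0], [0, 1]], [[2, 2], [2, 2]]]
real = [[[2, 2], [2, 1]], [[0, 0], [1, 1]]]
pairs = select_pairs(synthetic, real, num_classes=3)
```

With `top_k=1` the pairing is deterministic. Larger values trade semantic agreement for more variety in the pairs seen during training, which moves the scheme back toward the random pairing described above.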