Ensuring the realism of computer-generated synthetic images is crucial to deep neural network (DNN) training. Because synthetic and real-world captured datasets have different semantic distributions, a semantic mismatch arises between synthetic images and their refined counterparts, which in turn causes semantic distortion. Recently, contrastive learning (CL) has been used successfully to pull correlated patches together and push uncorrelated ones apart. In this work, we exploit the semantic and structural consistency between synthetic and refined images and adopt CL to reduce semantic distortion. In addition, we incorporate hard negative mining to further improve performance. We compare our method with several benchmark methods using qualitative and quantitative measures and show that it achieves state-of-the-art performance.
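As an illustration of the idea of pulling correlated patches together and pushing uncorrelated ones apart, the following is a minimal sketch of a patch-wise InfoNCE-style loss with a simple form of hard negative mining. The abstract does not specify the loss, the feature extractor, or the mining strategy, so everything here is an assumption: the function name `patch_nce_loss`, the `num_hard` and `temperature` parameters, and the choice of keeping the top-k most similar negatives are hypothetical and are not the authors' method.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def patch_nce_loss(query, keys, num_hard=None, temperature=0.07):
    """InfoNCE-style loss over patch features (illustrative sketch only).

    query: (N, D) features of N patches from the refined image.
    keys:  (N, D) features of the patches at the same locations in the
           synthetic image. keys[i] is the positive for query[i]; all
           other rows of keys act as negatives.
    num_hard: if given, keep only the num_hard negatives most similar to
           each query (an assumed top-k form of hard negative mining).
    """
    q = l2_normalize(np.asarray(query, dtype=np.float64))
    k = l2_normalize(np.asarray(keys, dtype=np.float64))
    sim = q @ k.T                      # (N, N) cosine similarities
    n = sim.shape[0]

    pos = np.diag(sim)                 # similarity to the matching patch
    neg = sim.copy()
    np.fill_diagonal(neg, -np.inf)     # exclude the positive from negatives

    if num_hard is not None:
        num_hard = min(num_hard, n - 1)
        # Sort each row in descending order and keep the hardest negatives.
        neg = -np.sort(-neg, axis=1)[:, :num_hard]

    # Column 0 holds the positive logit; the rest are negatives.
    logits = np.concatenate([pos[:, None], neg], axis=1) / temperature

    # Numerically stable cross-entropy with the positive as the target.
    m = logits.max(axis=1, keepdims=True)
    log_denom = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return float(np.mean(log_denom - logits[:, 0]))
```

In this sketch, the loss is small when each refined patch is most similar to the synthetic patch at the same location and grows when it is more similar to patches elsewhere. Restricting the denominator to the hardest negatives focuses the loss on the patches most easily confused with the positive.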