Because they image stably in adverse environments (e.g., total darkness), thermal infrared (TIR) cameras are a prevalent option for night scene perception. However, the low contrast and lack of chromaticity of TIR images hinder both human interpretation and the subsequent deployment of RGB-based vision algorithms. It is therefore natural to colorize nighttime TIR images by translating them into the corresponding daytime color images (NTIR2DC). Despite the impressive progress made on the NTIR2DC task, improving translation performance for small object classes remains under-explored. To address this problem, we propose a generative adversarial network incorporating feedback-based object appearance learning (FoalGAN). Specifically, an occlusion-aware mixup module and a corresponding appearance consistency loss are proposed to reduce the context dependence of object translation. Taking traffic lights as a representative example of small objects in nighttime street scenes, we illustrate how to enhance their realism by designing a traffic light appearance loss. To further improve the appearance learning of small objects, we devise a dual feedback learning strategy that selectively adjusts the learning frequency of different samples. In addition, we provide pixel-level annotations for a subset of the Brno dataset, which can facilitate research on NTIR image understanding under multiple weather conditions. Extensive experiments demonstrate that the proposed FoalGAN is not only effective for appearance learning of small objects, but also outperforms other image translation methods in terms of semantic preservation and edge consistency on the NTIR2DC task.
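The idea of adjusting the learning frequency of different samples can be illustrated with a generic error-weighted sampler. The sketch below is not FoalGAN's dual feedback procedure, which is not specified here. It only shows one common way to draw poorly learned samples more often. The class name `FeedbackSampler`, the `momentum` and `floor` parameters, and the use of a single scalar error per sample are all assumptions made for illustration.

```python
import random


class FeedbackSampler:
    """Hypothetical sampler: draws indices in proportion to a running error score."""

    def __init__(self, num_samples, momentum=0.9, floor=0.1, seed=0):
        self.scores = [1.0] * num_samples  # start from uniform scores
        self.momentum = momentum           # how much of the old score is kept
        self.floor = floor                 # keeps every sample reachable
        self.rng = random.Random(seed)

    def update(self, index, error):
        """Blend a new non-negative error value into one sample's running score."""
        if error < 0:
            raise ValueError("error must be non-negative")
        old = self.scores[index]
        self.scores[index] = self.momentum * old + (1.0 - self.momentum) * error

    def weights(self):
        """Sampling weights: running score plus the floor, normalised to sum to 1."""
        raw = [s + self.floor for s in self.scores]
        total = sum(raw)
        return [r / total for r in raw]

    def sample(self, k):
        """Draw k indices with replacement according to the current weights."""
        return self.rng.choices(range(len(self.scores)), weights=self.weights(), k=k)


if __name__ == "__main__":
    sampler = FeedbackSampler(num_samples=3, momentum=0.0)
    # Assumed feedback signal: sample 2 is translated poorly, samples 0 and 1 well.
    for idx, err in enumerate([0.0, 0.0, 5.0]):
        sampler.update(idx, err)
    print(sampler.weights())
    print(sampler.sample(10))
```

In a training loop, `update` would be called with whatever per-sample feedback signal is available (for instance an appearance error measured on small objects), and `sample` would choose the indices for the next batch. The floor term prevents well-learned samples from disappearing from training entirely.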