Integrating visible and infrared images into one high-quality image, also known as visible and infrared image fusion, is a challenging yet critical task for many downstream vision tasks. Most existing works rely on pretrained deep neural networks or design sophisticated frameworks with strong priors for this task, which can be unsuitable or inflexible. This paper presents SimpleFusion, a simple yet effective framework for visible and infrared image fusion. Our framework follows the decompose-and-fuse paradigm: the visible and infrared images are decomposed into reflectance and illumination components via Retinex theory, and the corresponding components are then fused. The whole framework is built from two plain convolutional neural networks without downsampling, which perform image decomposition and fusion efficiently. Moreover, we introduce a decomposition loss and a detail-to-semantic loss to preserve the complementary information between the two modalities during fusion. Extensive experiments on challenging benchmarks verify the superiority of our method over previous state-of-the-art approaches. Code is available at \href{https://github.com/hxwxss/SimpleFusion-A-Simple-Fusion-Framework-for-Infrared-and-Visible-Images}{https://github.com/hxwxss/SimpleFusion-A-Simple-Fusion-Framework-for-Infrared-and-Visible-Images}
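The Retinex decomposition the abstract refers to models an image as the pixel-wise product of a reflectance component and an illumination component, $I = R \odot L$. Below is a minimal numpy sketch of the classic (non-learned) version, where illumination is approximated by a local box-blur mean; the paper's framework instead learns this decomposition with plain convolutional networks, so the blur here is only an illustrative assumption.

```python
import numpy as np

def box_blur(img, k=15):
    """Naive local-mean smoothing as a crude illumination estimate."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def retinex_decompose(img, k=15, eps=1e-6):
    """Split an image into reflectance and illumination: I = R * L.

    Illumination L is the smoothed image; reflectance R is the
    pixel-wise ratio I / L (eps avoids division by zero).
    """
    illumination = box_blur(img, k)
    reflectance = img / (illumination + eps)
    return reflectance, illumination

# Toy example: decompose a random grayscale "image" and reconstruct it.
img = np.random.rand(32, 32)
R, L = retinex_decompose(img)
recon = R * (L + 1e-6)
print(np.allclose(recon, img))  # the product recovers the input exactly
```

Under the decompose-and-fuse paradigm, each modality would be decomposed this way, the reflectance components fused to keep detail, and the illumination components fused to keep global intensity.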