Infrared-visible image fusion aims to integrate complementary infrared and visible information into a single fused image. Existing 2D fusion methods operate on images captured from fixed camera viewpoints and therefore lack a comprehensive understanding of complex scenes, which leads to the loss of critical scene information. To address this limitation, we propose a novel Infrared-Visible Gaussian Fusion (IVGF) framework, which reconstructs scene geometry from multimodal 2D inputs and enables direct rendering of fused images. Specifically, we propose a cross-modal adjustment (CMA) module that modulates the opacity of the Gaussians to resolve cross-modal conflicts. Moreover, to preserve the distinctive features of both modalities, we introduce a fusion loss that guides the optimization of the CMA module, ensuring that the fused image retains the critical characteristics of each modality. Comprehensive qualitative and quantitative experiments demonstrate the effectiveness of the proposed method.
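To make the CMA idea concrete, the sketch below illustrates one plausible way a per-Gaussian opacity could be modulated by a learned cross-modal gate. This is a minimal illustration, not the paper's actual implementation: the feature representations (`feat_ir`, `feat_vis`), the linear-projection parameterization (`w_ir`, `w_vis`), and the sigmoid gating are all assumptions introduced here for exposition.

```python
import numpy as np

def sigmoid(x):
    """Numerically standard logistic function, maps reals to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_adjust(opacity, feat_ir, feat_vis, w_ir, w_vis):
    """Hypothetical cross-modal adjustment of per-Gaussian opacity.

    opacity:  (N,)  base opacities of N Gaussians, each in [0, 1]
    feat_ir:  (N, D) per-Gaussian features from the infrared branch (assumed)
    feat_vis: (N, D) per-Gaussian features from the visible branch (assumed)
    w_ir, w_vis: (D,) learned projection weights (assumed parameterization)

    Returns modulated opacities in [0, opacity], so conflicting Gaussians
    can be down-weighted rather than hard-removed.
    """
    gate = sigmoid(feat_ir @ w_ir + feat_vis @ w_vis)  # (N,) values in (0, 1)
    return opacity * gate

# Example usage with random stand-in features.
rng = np.random.default_rng(0)
N, D = 6, 4
opacity = rng.uniform(0.0, 1.0, size=N)
out = cross_modal_adjust(
    opacity,
    rng.normal(size=(N, D)), rng.normal(size=(N, D)),
    rng.normal(size=D), rng.normal(size=D),
)
```

Because the gate is bounded in (0, 1), the modulation can only attenuate a Gaussian's contribution, never amplify it beyond its base opacity; the fusion loss described above would then steer the gate so that modality-specific details survive in the rendered fused image.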