Image dehazing poses significant challenges in environmental perception. Recent research mainly focuses on deep learning-based methods that use a single modality, which can suffer severe information loss, especially in dense-haze scenarios. Infrared images are robust to haze; however, existing methods have primarily treated the infrared modality as auxiliary information and fail to fully exploit its rich information for dehazing. To address this challenge, the key insight of this study is to design a visible-infrared fusion network for image dehazing, VIFNet. In particular, we propose a multi-scale Deep Structure Feature Extraction (DSFE) module, which incorporates a Channel-Pixel Attention Block (CPAB) to restore more spatial and edge information within the deep structural features. Additionally, we introduce an inconsistency-weighted fusion strategy that merges the two modalities by leveraging the more reliable information. To validate the approach, we construct a visible-infrared multimodal dataset called AirSim-VID based on the AirSim simulation platform. Extensive experiments on challenging real and simulated image datasets demonstrate that VIFNet can outperform many state-of-the-art competing methods. The code and dataset are available at https://github.com/mengyu212/VIFNet_dehazing.
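As a rough illustration of what an inconsistency-weighted fusion could look like, the sketch below fuses per-pixel feature vectors from the two modalities. It is not the method used in VIFNet: the function names, the cosine-based inconsistency measure, and the rule that shifts weight toward the infrared features as the modalities disagree are all assumptions made for this example, motivated only by the observation that infrared is more robust to haze.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def inconsistency_weighted_fusion(visible, infrared):
    """Fuse two lists of per-pixel feature vectors (hypothetical rule).

    visible, infrared: lists of equal-length feature vectors, one per pixel.
    Returns a list of fused feature vectors of the same shape.
    """
    if len(visible) != len(infrared):
        raise ValueError("both modalities must have the same number of pixels")

    fused = []
    for v, r in zip(visible, infrared):
        # Inconsistency in [0, 1]: about 0 when the features agree,
        # about 1 when they point in opposite directions.
        inconsistency = 0.5 * (1.0 - cosine_similarity(v, r))
        # Assumed rule: equal weights when the modalities agree, and
        # progressively more weight on infrared as they disagree.
        w_ir = 0.5 + 0.5 * inconsistency
        w_vis = 1.0 - w_ir
        fused.append([w_vis * x + w_ir * y for x, y in zip(v, r)])
    return fused
```

With this rule, pixels where the visible and infrared features agree are simply averaged, while pixels where they diverge (for example, in dense haze) rely mostly on the infrared features. A learned weighting would normally replace the fixed formula.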