Haze usually leads to deteriorated images with low contrast, color shift and structural distortion. We observe that many deep-learning-based models perform exceptionally well at removing homogeneous haze, but they usually fail on non-homogeneous dehazing. Two main factors account for this situation. Firstly, because dense haze is distributed in an intricate and non-uniform way, recovering structural and chromatic features with high fidelity is challenging, particularly in regions with heavy haze. Secondly, the existing small-scale datasets for non-homogeneous dehazing are inadequate to support reliable learning of feature mappings between hazy images and their haze-free counterparts by convolutional neural network (CNN)-based models. To tackle these two challenges, we propose a novel two-branch network that leverages the 2D discrete wavelet transform (DWT), fast Fourier convolution (FFC) residual blocks and a pretrained ConvNeXt model. Specifically, in the DWT-FFC frequency branch, our model exploits DWT to capture more high-frequency features. Moreover, by taking advantage of the large receptive field provided by FFC residual blocks, our model is able to effectively explore global contextual information and produce images with better perceptual quality. In the prior knowledge branch, an ImageNet-pretrained ConvNeXt is adopted in place of Res2Net. This enables our model to learn more supplementary information and acquire a stronger generalization ability. The feasibility and effectiveness of the proposed method are demonstrated via extensive experiments and ablation studies. The code is available at https://github.com/zhouh115/DWT-FFC.
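To illustrate how a 2D DWT separates high-frequency content from a low-frequency approximation, the following is a minimal NumPy sketch of a single-level transform. It assumes the Haar wavelet and a single-channel input with even height and width; the wavelet actually used in the proposed network is not stated above, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical and are not taken from the released code. Subband naming conventions (LH versus HL) also differ between libraries.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT (orthonormal scaling) of a 2D array.

    Assumes even height and width. Returns four half-resolution subbands:
    ll (low-frequency approximation) and lh, hl, hh (detail subbands).
    """
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # scaled block average
    lh = (a + b - c - d) / 2.0  # difference between top and bottom rows
    hl = (a - b + c - d) / 2.0  # difference between left and right columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```

In this sketch the three detail subbands carry edges and fine texture, which is the kind of high-frequency information the frequency branch is described as capturing, while the transform itself remains exactly invertible.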