Image dehazing is a representative low-level vision task that estimates latent haze-free images from hazy images. In recent years, convolutional neural network-based methods have dominated image dehazing. However, vision Transformers, which have recently made breakthroughs in high-level vision tasks, have not brought new dimensions to image dehazing. We start with the popular Swin Transformer and find that several of its key designs are unsuitable for image dehazing. To address this, we propose DehazeFormer, which incorporates several improvements, including a modified normalization layer, activation function, and spatial information aggregation scheme. We train multiple variants of DehazeFormer on various datasets to demonstrate its effectiveness. Specifically, on the most frequently used SOTS indoor set, our small model outperforms FFA-Net with only 25% of the parameters and 5% of the computational cost. To the best of our knowledge, our large model is the first method to achieve a PSNR above 40 dB on the SOTS indoor set, dramatically outperforming the previous state-of-the-art methods. We also collect a large-scale realistic remote sensing dehazing dataset to evaluate the method's ability to remove highly non-homogeneous haze.
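To put the 40 dB figure in context, the following is a minimal sketch of how PSNR is commonly computed between a haze-free reference and a restored image. It assumes floating-point images scaled to [0, 1]; the function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the DehazeFormer code, and the evaluation protocol used in the paper may differ in detail.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Assumes pixel values lie in [0, max_value] (an assumption of this sketch).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


# Synthetic example: a uniform per-pixel error of 0.01 on a [0, 1] image.
clean = np.zeros((8, 8, 3))
restored = np.full((8, 8, 3), 0.01)
score = psnr(clean, restored)
```

Under these assumptions, a PSNR of 40 dB corresponds to a mean squared error of 1e-4, that is, a root-mean-square error of about 1% of the intensity range.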