This work presents a depth-consistency self-prompt Transformer for image dehazing. It is motivated by the observation that the depth estimated from an image with haze residuals differs from the depth estimated from its clear counterpart. Enforcing depth consistency between dehazed images and clear ones is therefore essential for dehazing. To this end, we develop a prompt based on the features of the depth differences between hazy input images and their corresponding clear counterparts, which guides dehazing models toward better restoration. Specifically, we first apply deep features extracted from the input images to the depth difference features to generate a prompt that contains the haze residual information in the input. We then propose a prompt embedding module that perceives the haze residuals by linearly adding the prompt to the deep features. Further, we develop a prompt attention module that pays more attention to haze residuals for better removal. By incorporating the prompt, prompt embedding, and prompt attention into an encoder-decoder network based on VQGAN, we achieve better perceptual quality. Because the depths of clear images are not available at inference, and dehazed images produced by a single feed-forward pass may still contain haze residuals, we propose a new continuous self-prompt inference that iteratively corrects the dehazing model toward better haze-free image generation. Extensive experiments show that our method performs favorably against state-of-the-art approaches on both synthetic and real-world datasets in terms of perceptual metrics including NIQE, PI, and PIQE.
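The inference procedure described above can be illustrated with a small toy sketch. It is not the authors' implementation: the names `depth_difference`, `make_prompt`, `embed_prompt`, and `continuous_self_prompt` are hypothetical, images and features are stood in for by flat lists of floats, and the encoder, decoder, and depth estimator are passed in as placeholder callables. Two points are assumptions rather than details stated in the abstract: "applying" the deep features to the depth difference features is modelled as an element-wise product, and at inference the previous dehazed output takes the place of the unavailable clear image when computing the depth difference. The prompt attention module is omitted.

```python
from typing import Callable, List

Vector = List[float]  # toy stand-in for an image, a depth map, or a feature map


def depth_difference(depth_a: Vector, depth_b: Vector) -> Vector:
    """Element-wise absolute difference between two depth estimates."""
    return [abs(a - b) for a, b in zip(depth_a, depth_b)]


def make_prompt(deep_features: Vector, depth_diff: Vector) -> Vector:
    """Build the prompt from deep features and depth-difference features.

    Assumption: the combination is modelled as an element-wise product.
    """
    return [f * d for f, d in zip(deep_features, depth_diff)]


def embed_prompt(deep_features: Vector, prompt: Vector) -> Vector:
    """Prompt embedding: linearly add the prompt to the deep features."""
    return [f + p for f, p in zip(deep_features, prompt)]


def continuous_self_prompt(
    hazy: Vector,
    encode: Callable[[Vector], Vector],
    decode: Callable[[Vector], Vector],
    estimate_depth: Callable[[Vector], Vector],
    steps: int = 3,
) -> List[Vector]:
    """Iteratively refine a dehazed result, returning every intermediate output.

    Assumption: the previous dehazed output stands in for the clear image,
    whose depth is not available at inference.
    """
    features = encode(hazy)
    current = decode(features)  # initial pass without any prompt
    outputs = [current]
    for _ in range(steps):
        diff = depth_difference(estimate_depth(hazy), estimate_depth(current))
        prompt = make_prompt(features, diff)
        current = decode(embed_prompt(features, prompt))
        outputs.append(current)
    return outputs
```

In this sketch, a zero depth difference produces a zero prompt, so the features pass through unchanged; this mirrors the idea that the prompt carries information only where haze residuals make the two depth estimates disagree.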