SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency

This work presents an effective depth-consistency self-prompt Transformer for image dehazing. It is motivated by the observation that the estimated depth of an image with haze residuals differs from that of its clear counterpart. Enforcing depth consistency between dehazed images and clear ones is therefore essential for dehazing. To this end, we develop a prompt based on the features of depth differences between the hazy input images and their corresponding clear counterparts, which can guide dehazing models toward better restoration. Specifically, we first apply deep features extracted from the input images to the depth difference features to generate a prompt that contains the haze residual information in the input. We then propose a prompt embedding module that perceives the haze residuals by linearly adding the prompt to the deep features. Further, we develop an effective prompt attention module that pays more attention to haze residuals for better removal. By incorporating the prompt, prompt embedding, and prompt attention into an encoder-decoder network based on VQGAN, we achieve better perceptual quality. Because the depths of clear images are not available at inference, and dehazed images produced by a single feed-forward pass may still contain haze residuals, we propose a new continuous self-prompt inference that iteratively corrects the dehazing model toward better haze-free image generation. Extensive experiments show that our method performs favorably against state-of-the-art approaches on both synthetic and real-world datasets in terms of perceptual metrics, including NIQE, PI, and PIQE.
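
The prompt, the prompt embedding, and the iterative inference described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The function names (`build_prompt`, `embed_prompt`, `continuous_self_prompt`) and the callables `encoder`, `decoder`, and `depth_fn` are hypothetical stand-ins. The element-wise product used to "apply" features to the depth difference, and the use of the previous output in place of the unavailable clear image, are assumptions. The prompt attention module is omitted.

```python
import numpy as np


def build_prompt(features, depth_hazy, depth_ref):
    """Combine deep features with a depth-difference map (assumed form)."""
    # Large depth differences are assumed to mark haze residuals.
    depth_diff = np.abs(depth_hazy - depth_ref)
    # Assumption: "applying" features to the difference is an element-wise product.
    return features * depth_diff


def embed_prompt(features, prompt):
    """Prompt embedding: linearly add the prompt to the deep features."""
    return features + prompt


def continuous_self_prompt(hazy, encoder, decoder, depth_fn, steps=3):
    """Iteratively refine the output, rebuilding the prompt at every step."""
    features = encoder(hazy)
    # First pass without a prompt: no clear-image depth exists at inference.
    dehazed = decoder(features)
    depth_hazy = depth_fn(hazy)
    for _ in range(steps):
        # Assumption: the previous output stands in for the clear image.
        prompt = build_prompt(features, depth_hazy, depth_fn(dehazed))
        dehazed = decoder(embed_prompt(features, prompt))
    return dehazed


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end; real models would replace them.
    rng = np.random.default_rng(0)
    hazy_image = rng.random((4, 4, 3))
    result = continuous_self_prompt(
        hazy_image,
        encoder=lambda img: img,
        decoder=lambda feats: np.clip(feats, 0.0, 1.0),
        depth_fn=lambda img: img.mean(axis=-1, keepdims=True),
    )
    print(result.shape)
```

In this sketch the hazy input is encoded once and only the prompt changes between iterations, which is one plausible reading of "continuous" self-prompting. Feeding each output back in as the next input would be a different, recurrent design.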
