Image super-resolution (SR) methods typically model degradation to improve reconstruction accuracy in complex and unknown degradation scenarios. However, extracting degradation information from low-resolution images is challenging, which limits model performance. One feasible way to boost image SR performance is to introduce additional priors. Inspired by advances in multi-modal methods and text-prompted image processing, we introduce text prompts to image SR to provide degradation priors. Specifically, we first design a text-image generation pipeline that integrates text into the SR dataset through a text degradation representation and a degradation model. The text representation discretizes the degradation with a binning method so that it is described abstractly, while retaining the flexibility of language. Meanwhile, we propose PromptSR to realize text-prompt SR. PromptSR employs a diffusion model and a pre-trained language model (e.g., T5 or CLIP). We train the model on the generated text-image dataset. Extensive experiments indicate that introducing text prompts into image SR yields excellent results on both synthetic and real-world images. Code: https://github.com/zhengchen1999/PromptSR.
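The binning-based text degradation representation can be illustrated with a minimal sketch. It maps continuous degradation parameters to coarse words and joins them into a prompt. The bin edges, the three-level vocabulary, the choice of parameters (blur sigma, noise sigma, JPEG quality), and the function names are all assumptions made for illustration; they are not taken from the PromptSR code or paper.

```python
from bisect import bisect_right

# Hypothetical vocabulary and bin edges (illustrative assumptions only).
LEVELS = ["light", "medium", "heavy"]
BLUR_EDGES = [1.0, 2.5]     # Gaussian blur sigma
NOISE_EDGES = [10.0, 25.0]  # Gaussian noise sigma on a 0-255 scale
COMP_EDGES = [30.0, 60.0]   # compression strength, defined as 100 - JPEG quality


def bin_label(value, edges, labels):
    """Return the label of the bin that `value` falls into.

    `edges` must be sorted and have len(labels) - 1 entries.
    A value equal to an edge is assigned to the upper bin.
    """
    return labels[bisect_right(edges, value)]


def describe_degradation(blur_sigma, noise_sigma, jpeg_quality):
    """Build a text prompt that describes the degradation abstractly."""
    blur = bin_label(blur_sigma, BLUR_EDGES, LEVELS)
    noise = bin_label(noise_sigma, NOISE_EDGES, LEVELS)
    # Lower JPEG quality means stronger compression, so invert the scale.
    compression = bin_label(100.0 - jpeg_quality, COMP_EDGES, LEVELS)
    return f"{blur} blur, {noise} noise, {compression} compression"


if __name__ == "__main__":
    print(describe_degradation(blur_sigma=3.0, noise_sigma=15.0, jpeg_quality=30))
```

Because the prompt only records which bin each parameter falls into, many distinct degradation settings share one description, which is the sense in which such a representation is abstract rather than exact.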