Existing Blind Image Super-Resolution (BSR) methods focus on estimating either kernel or degradation information, but have long overlooked essential content details. In this paper, we propose a novel BSR approach, the Content-aware Degradation-driven Transformer (CDFormer), to capture both degradation and content representations. Since low-resolution images alone cannot provide sufficient content details, we introduce a diffusion-based module $CDFormer_{diff}$ that first learns a Content Degradation Prior (CDP) from both low- and high-resolution images, and then approximates the real distribution given only low-resolution information. Moreover, we apply an adaptive SR network $CDFormer_{SR}$ that effectively exploits the CDP to refine features. Unlike previous diffusion-based SR methods, we treat the diffusion model as an estimator, thereby avoiding the limitations of expensive sampling time and excessive diversity. Experiments show that CDFormer outperforms existing methods, establishing new state-of-the-art performance on various benchmarks under blind settings. Code and models will be available at \href{https://github.com/I2-Multimedia-Lab/CDFormer}{https://github.com/I2-Multimedia-Lab/CDFormer}.