Pre-trained diffusion models for image generation encode rich prior knowledge of intricate textures, and exploiting this prior for image super-resolution is a compelling direction. However, existing diffusion-based methods overlook the constraints that degradation information imposes on the diffusion process. They also ignore the spatial variability of the estimated blur kernel caused by factors such as motion jitter and out-of-focus regions in open-environment scenarios, which leads to super-resolution results that deviate noticeably from the ground truth. To address these issues, we introduce a framework named Adaptive Multi-modal Fusion of \textbf{S}patially Variant Kernel Refinement with Diffusion Model for Blind Image \textbf{S}uper-\textbf{R}esolution (SSR). Within SSR, we propose a Spatially Variant Kernel Refinement (SVKR) module that estimates a Depth-Informed Kernel: a spatially variant blur kernel that takes depth information into account. SVKR also improves the accuracy of the depth information recovered from the LR image, so the depth-map and blur-kernel estimates refine each other. Finally, we introduce an Adaptive Multi-Modal Fusion (AMF) module that aligns information from three modalities: the low-resolution image, the depth map, and the blur kernel. This alignment constrains the diffusion model to generate more faithful SR results.
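To make the notion of a spatially variant kernel concrete, the sketch below applies a different blur kernel at each pixel of a grayscale image, in contrast to the single global kernel most blind-SR pipelines estimate. This is an illustrative degradation model only, not the paper's implementation; all names are hypothetical.

```python
def apply_spatially_variant_blur(image, kernels):
    """Blur a 2-D grayscale image with a per-pixel kernel.

    image:   H x W list of lists of floats.
    kernels: H x W grid; kernels[y][x] is a k x k kernel (list of lists)
             whose weights sum to 1. Borders are handled by clamping
             coordinates to the image, i.e. replicate padding.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            k = kernels[y][x]        # this pixel's own blur kernel
            r = len(k) // 2          # kernel radius
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp at borders
                    xx = min(max(x + dx, 0), w - 1)
                    acc += k[dy + r][dx + r] * image[yy][xx]
            out[y][x] = acc
    return out


# Usage: an impulse image; one corner pixel gets a 3x3 box blur while
# the rest keep an identity (delta) kernel, mimicking a locally
# out-of-focus region next to sharp content.
img = [[0.0, 0.0, 0.0],
       [0.0, 9.0, 0.0],
       [0.0, 0.0, 0.0]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
box = [[1.0 / 9.0] * 3 for _ in range(3)]
kernels = [[identity] * 3 for _ in range(3)]
kernels[0] = [box, identity, identity]  # vary the kernel at one pixel
out = apply_spatially_variant_blur(img, kernels)
```

A depth-informed variant, as SVKR suggests, would choose each `kernels[y][x]` as a function of the estimated depth at that pixel (e.g. wider defocus kernels for points farther from the focal plane).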