DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild

Image quality assessment (IQA) plays a critical role in selecting high-quality images and in guiding compression and enhancement methods across a range of applications. Blind IQA, which assesses the quality of in-the-wild images containing complex authentic distortions without reference images, is more challenging. Existing methods are limited to modeling a uniform distribution over local patches and suffer from the gap between low-level and high-level vision caused by widely adopted pre-trained classification networks. In this paper, we propose a novel IQA method called diffusion priors-based IQA (DP-IQA), which leverages the prior knowledge of a pre-trained diffusion model and its strong ability to bridge semantic gaps in the perception of image visual quality. Specifically, we use pre-trained Stable Diffusion as the backbone, extract multi-level features from the upsampling stages of the denoising U-Net at a specified timestep, and decode them to estimate the image quality score. Text and image adapters are adopted to mitigate the domain gap for the downstream task and to correct the information loss caused by the variational autoencoder bottleneck. Finally, we distill the knowledge of this model into a CNN-based student model, which significantly reduces the parameter count and improves applicability; surprisingly, the student performs similarly to or even better than the teacher. Experimental results demonstrate that DP-IQA achieves state-of-the-art results on various in-the-wild datasets with better generalization capability, which shows the strength of our method in global modeling and in exploiting the hierarchical feature cues of diffusion models for evaluating image quality.
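The distillation step can be illustrated with a small sketch. The abstract does not state which loss is used, so the code below assumes a common formulation: a weighted sum of the error against human quality labels and the error against the teacher's predictions. The names `distillation_loss` and `mse`, the weight `alpha`, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty score lists."""
    if len(pred) == 0 or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def distillation_loss(
    student_scores: Sequence[float],
    teacher_scores: Sequence[float],
    quality_labels: Sequence[float],
    alpha: float = 0.5,  # hypothetical weight, not a value from the paper
) -> float:
    """Blend supervision from human labels with supervision from the teacher.

    Assumed formulation (not confirmed by the source):
        (1 - alpha) * MSE(student, labels) + alpha * MSE(student, teacher)
    """
    label_term = mse(student_scores, quality_labels)
    teacher_term = mse(student_scores, teacher_scores)
    return (1.0 - alpha) * label_term + alpha * teacher_term


# Toy batch of three images; scores are made-up values in [0, 1].
student = [0.5, 0.25, 0.75]
teacher = [0.5, 0.5, 0.75]
labels = [0.75, 0.25, 0.75]
loss = distillation_loss(student, teacher, labels, alpha=0.5)
print(f"distillation loss: {loss:.6f}")
```

Setting `alpha` to 0 reduces this to ordinary supervised training of the student, while larger values pull the student toward the teacher's outputs. In practice the same idea would be applied to tensors in a deep learning framework rather than to Python lists.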

