DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild

Image quality assessment (IQA) plays a critical role in selecting high-quality images and in guiding compression and enhancement methods across a range of applications. Blind IQA, which assesses the quality of in-the-wild images containing complex authentic distortions without reference images, is more challenging. Existing methods are limited to modeling a uniform distribution with local patches and suffer from the gap between low-level and high-level vision caused by widely adopted pre-trained classification networks. In this paper, we propose a novel IQA method called diffusion priors-based IQA (DP-IQA), which leverages the prior knowledge of a pre-trained diffusion model, with its strong ability to bridge semantic gaps, in perceiving the visual quality of images. Specifically, we use pre-trained Stable Diffusion as the backbone, extract multi-level features from the denoising U-Net during its upsampling process at a specified timestep, and decode them to estimate the image quality score. Text and image adapters are adopted to mitigate the domain gap for downstream tasks and to correct the information loss caused by the variational autoencoder bottleneck. Finally, we distill the knowledge of this model into a CNN-based student model, which significantly reduces the parameter count and improves applicability; the student model performs similarly to, or even better than, the teacher model. Experimental results demonstrate that DP-IQA achieves state-of-the-art results on various in-the-wild datasets with better generalization capability, showing the strength of our method in global modeling and in using the hierarchical feature cues of diffusion models to evaluate image quality.
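The two mechanisms named in the abstract, decoding multi-level features into a single quality score and distilling a teacher into a student, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not specify the decoder or the distillation objective, so the sketch swaps in global average pooling with a linear head and a simple blend of two mean-squared-error terms. The names `global_avg_pool`, `predict_quality`, `distillation_loss`, and the weight `alpha` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence

# A feature map is stored as channels x height x width (nested lists).
FeatureMap = List[List[List[float]]]


def global_avg_pool(fmap: FeatureMap) -> List[float]:
    """Average each channel over all of its spatial positions."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def predict_quality(levels: Sequence[FeatureMap],
                    weights: Sequence[float],
                    bias: float) -> float:
    """Pool every feature level, concatenate, and apply a linear head.

    Assumption: a stand-in for the paper's decoder, which is not
    described in the abstract.
    """
    pooled = [v for fmap in levels for v in global_avg_pool(fmap)]
    if len(pooled) != len(weights):
        raise ValueError("one weight is required per pooled channel")
    return sum(w * v for w, v in zip(weights, pooled)) + bias


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length score lists."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def distillation_loss(student: Sequence[float],
                      teacher: Sequence[float],
                      labels: Sequence[float],
                      alpha: float = 0.5) -> float:
    """Blend a label term with a teacher-matching term.

    Assumption: `alpha` and the MSE form are illustrative choices,
    not the objective used by the authors.
    """
    return (1 - alpha) * mse(student, labels) + alpha * mse(student, teacher)


if __name__ == "__main__":
    # Two toy feature levels: 1 channel of size 1x2, then 2 channels of size 1x1.
    levels = [[[[1.0, 3.0]]], [[[4.0]], [[6.0]]]]
    print(predict_quality(levels, weights=[0.1, 0.2, 0.3], bias=0.5))
    print(distillation_loss([0.5, 0.7], [0.6, 0.8], [0.4, 0.9]))
```

In a real setting the feature maps would come from the diffusion U-Net and the head would be trained, whereas here the weights are fixed by hand only to show the data flow.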

