Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

With the advent of image super-resolution (SR) algorithms, evaluating the quality of generated SR images has become an urgent task. Although full-reference methods perform well in SR image quality assessment (SR-IQA), their reliance on high-resolution (HR) images limits their practical applicability. Making full use of the reconstruction information that is available, such as the low-resolution (LR) image and the scale factor, is a promising way to improve SR-IQA performance when no HR reference exists. In this letter, we evaluate both the perceptual quality and the reconstruction fidelity of SR images using the LR images and scale factors. Specifically, we propose a novel dual-branch reduced-reference SR-IQA network, i.e., Perception- and Fidelity-aware SR-IQA (PFIQA). The perception-aware branch evaluates the perceptual quality of SR images by combining the global modeling of the Vision Transformer (ViT) with the local relation modeling of ResNet, and by incorporating the scale factor to enable comprehensive visual perception. Meanwhile, the fidelity-aware branch assesses the reconstruction fidelity between LR and SR images through their visual perception. The combination of the two branches aligns closely with the human visual system, enabling a comprehensive evaluation of SR images. Experimental results indicate that PFIQA outperforms current state-of-the-art models across three widely used SR-IQA benchmarks. Notably, PFIQA excels in assessing the quality of real-world SR images.
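The reduced-reference setting described above, in which only the LR image and the scale factor stand in for the missing HR image, can be illustrated with a toy consistency check. This is not the PFIQA fidelity-aware branch, which compares LR and SR images through learned visual-perception features. The sketch swaps in a plain pixel-level comparison and assumes a box-filter degradation model. The names `box_downsample` and `lr_consistency` are hypothetical and chosen only for illustration.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values in [0, 1]


def box_downsample(img: Image, scale: int) -> Image:
    """Average each non-overlapping scale x scale block (assumed degradation model)."""
    out_h, out_w = len(img) // scale, len(img[0]) // scale
    return [
        [
            sum(
                img[y * scale + dy][x * scale + dx]
                for dy in range(scale)
                for dx in range(scale)
            ) / (scale * scale)
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def lr_consistency(lr: Image, sr: Image, scale: int) -> float:
    """Toy fidelity proxy in (0, 1]; 1.0 means the downsampled SR image equals the LR image."""
    down = box_downsample(sr, scale)
    if len(down) != len(lr) or len(down[0]) != len(lr[0]):
        raise ValueError("SR size must equal LR size times the scale factor")
    squared_errors = [
        (a - b) ** 2 for row_d, row_l in zip(down, lr) for a, b in zip(row_d, row_l)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    return 1.0 / (1.0 + mse)


# Hypothetical 2x2 LR image and two candidate x2 SR outputs.
lr = [[0.25, 0.75], [0.5, 0.0]]
# Nearest-neighbour x2 upsampling of lr: fully consistent with lr.
faithful_sr = [[lr[y // 2][x // 2] for x in range(4)] for y in range(4)]
# A flat grey image: smooth, but it ignores the LR content.
flat_sr = [[0.5] * 4 for _ in range(4)]

print(lr_consistency(lr, faithful_sr, 2))
print(lr_consistency(lr, flat_sr, 2))
```

The faithful candidate scores 1.0 and the flat candidate scores lower, so the check rewards agreement with the LR input without ever consulting an HR image. A score of this kind says nothing about perceptual quality (a blocky nearest-neighbour upsample is fully consistent), which is why the abstract pairs a fidelity measure with a separate perception-aware branch.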
