Recent years have witnessed remarkable achievements in perceptual image restoration (IR), creating an urgent demand for accurate image quality assessment (IQA), which is essential for both performance comparison and algorithm optimization. Unfortunately, existing IQA metrics exhibit inherent weaknesses on IR tasks, particularly when distinguishing fine-grained quality differences among restored images. To address this dilemma, we contribute the first fine-grained image quality assessment dataset for image restoration, termed FGRestore, comprising 18,408 restored images across six common IR tasks. Beyond conventional scalar quality scores, FGRestore is also annotated with 30,886 fine-grained pairwise preferences. Based on FGRestore, we conduct a comprehensive benchmark of existing IQA metrics, which reveals significant inconsistencies between score-based IQA evaluations and fine-grained restoration quality. Motivated by these findings, we further propose FGResQ, a new IQA model specifically designed for image restoration, which features both coarse-grained score regression and fine-grained quality ranking. Extensive experiments and comparisons demonstrate that FGResQ significantly outperforms state-of-the-art IQA metrics. Code and model weights are available at https://sxfly99.github.io/FGResQ-Home.