We introduce a novel cross-reference image quality assessment method that fills a gap in the image assessment landscape, complementing established evaluation schemes: full-reference metrics such as SSIM, no-reference metrics such as NIQE, general-reference metrics such as FID, and multi-modal-reference metrics such as CLIPScore. Utilising a neural network with a cross-attention mechanism and a unique data collection pipeline built from novel view synthesis (NVS) optimisation, our method enables accurate image quality assessment without requiring ground-truth references. By comparing a query image against multiple views of the same scene, it addresses the limitations of existing metrics in NVS and similar tasks where direct reference images are unavailable. Experimental results show that our method correlates closely with the full-reference metric SSIM while requiring no ground-truth references.
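To make the cross-attention comparison concrete, the following is a minimal sketch of how tokens from a query image could attend to tokens pooled from several reference views of the same scene. It is an illustration under stated assumptions, not the architecture used in this work: the token dimensions, the random projection matrices, the `cross_attention` helper, and the sigmoid scoring head are all hypothetical, and a trained network would learn these weights from the NVS-derived data.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_tokens, reference_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper).

    Queries come from the image under assessment; keys and values come
    from the reference views, so each query patch gathers evidence from
    the other views of the scene.
    """
    q = query_tokens @ w_q                    # (num_query_patches, dim)
    k = reference_tokens @ w_k                # (num_reference_patches, dim)
    v = reference_tokens @ w_v                # (num_reference_patches, dim)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


# Assumed toy sizes: 64 patches per image, 3 reference views, 16-dim tokens.
rng = np.random.default_rng(0)
dim, num_query_patches = 16, 64
num_views, patches_per_view = 3, 64

# Random stand-ins for patch features produced by an image encoder.
query_tokens = rng.normal(size=(num_query_patches, dim))
reference_tokens = rng.normal(size=(num_views * patches_per_view, dim))

# Random stand-ins for learned projection matrices.
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

attended, weights = cross_attention(query_tokens, reference_tokens, w_q, w_k, w_v)

# Hypothetical scoring head: one value in (0, 1) per query patch.
w_out = rng.normal(size=(dim,))
score_map = 1.0 / (1.0 + np.exp(-(attended @ w_out)))
```

With untrained weights the resulting `score_map` carries no meaning; the sketch only shows the data flow, in which the query image supplies the attention queries and the reference views supply the keys and values, so no pixel-aligned ground truth is needed.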