Several studies have shown that state-of-the-art image similarity metrics are not perceptual metrics; moreover, they struggle to evaluate images, especially when texture distortions are present. In this work, we propose a new perceptual metric composed of two terms. The first term measures the dissimilarity between the textures of two images using the Earth Mover's Distance. The second term measures the chromatic dissimilarity between the two images in the Oklab perceptual color space. We evaluate our metric on a non-traditional dataset, Berkeley-Adobe Perceptual Patch Similarity, which contains a wide range of complex distortions in shapes and colors. We show that our metric outperforms the state of the art, especially when images contain shape distortions, which also confirms that it better reflects human perception. Furthermore, although deep black-box metrics can be very accurate, they only provide a similarity score between two images, without explaining their main differences and similarities. Our metric, in contrast, provides visual explanations that support the computed score, making the similarity assessment transparent and justified.
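To make the two-term structure concrete, the following Python sketch combines a texture term based on the Earth Mover's Distance with a chromatic term computed in Oklab. It is only an illustration under stated assumptions, not the metric proposed here: the abstract does not specify the texture descriptor, the normalization, or how the two terms are combined. The gradient-magnitude histogram, the `weight` parameter, and the function names `texture_histogram`, `emd_1d`, and `two_term_score` are therefore hypothetical choices. The sRGB-to-Oklab conversion uses the matrices published with the Oklab color space.

```python
import numpy as np

# Matrices from the published Oklab definition (linear sRGB -> LMS -> Oklab).
_M1 = np.array([[0.4122214708, 0.5363325363, 0.0514459929],
                [0.2119034982, 0.6806995451, 0.1073969566],
                [0.0883024619, 0.2817188376, 0.6299787005]])
_M2 = np.array([[0.2104542553, 0.7936177850, -0.0040720468],
                [1.9779984951, -2.4285922050, 0.4505937099],
                [0.0259040371, 0.7827717662, -0.8086757660]])


def srgb_to_oklab(rgb):
    """Convert sRGB values in [0, 1] with shape (..., 3) to Oklab."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer function to obtain linear RGB.
    linear = np.where(rgb <= 0.04045, rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    lms = np.cbrt(linear @ _M1.T)
    return lms @ _M2.T


def texture_histogram(lightness, bins=32):
    """ASSUMPTION: describe texture by a histogram of gradient magnitudes."""
    gy, gx = np.gradient(lightness)
    magnitude = np.clip(np.hypot(gx, gy), 0.0, 1.0)
    hist, _ = np.histogram(magnitude, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()


def emd_1d(p, q):
    """Earth Mover's Distance between two normalized 1-D histograms on the
    same bins, measured in bin widths (sum of absolute CDF differences)."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))))


def two_term_score(img_a, img_b, weight=0.5, bins=32):
    """ASSUMPTION: combine the two terms with a simple convex weighting."""
    lab_a, lab_b = srgb_to_oklab(img_a), srgb_to_oklab(img_b)
    # Texture term: EMD between gradient histograms of the Oklab lightness,
    # divided by the largest possible shift so that it lies in [0, 1].
    texture = emd_1d(texture_histogram(lab_a[..., 0], bins),
                     texture_histogram(lab_b[..., 0], bins)) / (bins - 1)
    # Chromatic term: mean per-pixel Euclidean distance in Oklab.
    color = float(np.mean(np.linalg.norm(lab_a - lab_b, axis=-1)))
    return weight * texture + (1.0 - weight) * color, texture, color


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((32, 32, 3))
    b = np.clip(a + 0.1 * rng.standard_normal(a.shape), 0.0, 1.0)
    print(two_term_score(a, b))
```

Returning the two terms separately, rather than only their sum, keeps each contribution inspectable. In this sketch, the per-pixel Oklab distance could also be rendered as a map to show where two images differ chromatically; whether the proposed metric produces its visual explanations in this way is not stated in the abstract.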