With the proliferation of image-based applications across many domains, accurate and interpretable image similarity measures have become increasingly important. Existing image similarity models often lack transparency, which makes it difficult to understand why two images are considered similar. In this paper, we propose the concept of explainable image similarity, where the goal is to develop an approach that provides similarity scores together with visual factual and counterfactual explanations. To this end, we present a new framework that integrates Siamese Networks and Grad-CAM to provide explainable image similarity, and we discuss the potential benefits and challenges of adopting this approach. In addition, we provide a comprehensive discussion of the factual and counterfactual explanations produced by the proposed framework and how they can assist decision making. The proposed approach has the potential to enhance the interpretability, trustworthiness and user acceptance of image-based systems in real-world image similarity applications. The implementation code is available at https://github.com/ioannislivieris/Grad_CAM_Siamese.git.