Local Manifold Learning for No-Reference Image Quality Assessment

Contrastive learning has considerably advanced Image Quality Assessment (IQA) and is now widely adopted. Its core mechanism is to minimize the distance between quality-similar (positive) examples while maximizing the distance between quality-dissimilar (negative) examples. Despite these successes, current contrastive learning methods often neglect to preserve the local manifold structure. As a result, hard examples can end up highly similar in the feature space, which impedes effective differentiation and assessment. To address this issue, we propose a framework that integrates local manifold learning with contrastive learning for No-Reference Image Quality Assessment (NR-IQA). Our method first samples multiple crops from a given image and identifies the most visually salient crop. This crop is then used to cluster other crops from the same image as the positive class, while crops from different images are treated as negative classes to increase inter-class distance. Uniquely, our approach also treats non-salient crops from the same image as intra-class negatives to preserve their distinctiveness. Additionally, we employ a mutual learning framework, which further enhances the model's ability to adaptively learn and identify visually salient regions. Our approach outperforms state-of-the-art methods on 7 standard datasets, achieving PLCC values of 0.942 on TID2013 (compared to 0.908) and 0.914 on LIVEC (compared to 0.894).
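
To make the positive and negative roles described above concrete, the following is a minimal sketch of an InfoNCE-style objective, assuming the salient crop's embedding serves as the anchor. The abstract does not specify the loss, so the function names (`cosine`, `contrastive_loss`), the cosine similarity, the temperature value, and the way crops are passed in as plain vectors are all illustrative assumptions rather than the authors' formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchor, positives, intra_negatives, inter_negatives,
                     temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    anchor          -- embedding of the most salient crop
    positives       -- embeddings of same-image crops grouped with the anchor
    intra_negatives -- embeddings of non-salient crops from the same image
    inter_negatives -- embeddings of crops from other images
    """
    # Similarity mass pulled toward the anchor.
    pos = sum(math.exp(cosine(anchor, p) / temperature) for p in positives)
    # Similarity mass pushed away: both kinds of negatives sit in the denominator.
    neg = sum(math.exp(cosine(anchor, n) / temperature)
              for n in intra_negatives + inter_negatives)
    return -math.log(pos / (pos + neg))

# Toy 2-D embeddings, chosen by hand for illustration only.
anchor = [1.0, 0.0]
loss = contrastive_loss(
    anchor,
    positives=[[0.9, 0.1]],
    intra_negatives=[[0.0, 1.0]],
    inter_negatives=[[-1.0, 0.0]],
)
print(loss)
```

Placing the intra-class negatives in the same denominator as the inter-class negatives is one simple way to keep non-salient crops from collapsing onto the anchor; a weighted or separate term would be an equally plausible reading of the abstract.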

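
The PLCC values reported in the abstract refer to the Pearson linear correlation coefficient between predicted quality scores and subjective scores. As a reminder of what the metric measures, here is a minimal sketch of the plain Pearson formula; the function name `plcc` and the sample scores are hypothetical, and IQA evaluations often fit a nonlinear mapping to the predictions before computing the correlation, which this sketch omits.

```python
import math

def plcc(predicted, subjective):
    """Pearson linear correlation between two equal-length score lists."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_s = sum(subjective) / n
    # Covariance numerator and the two standard-deviation terms (unnormalized).
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, subjective))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    return cov / math.sqrt(var_p * var_s)

# Hypothetical model outputs and mean opinion scores, for illustration only.
predicted = [0.2, 0.5, 0.7, 0.9]
subjective = [1.1, 2.4, 3.2, 4.6]
print(plcc(predicted, subjective))
```

A value close to 1 indicates that predictions vary linearly with the subjective scores, which is why higher PLCC is reported as better.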