Existing blind image quality assessment (BIQA) methods focus on designing complicated networks based on convolutional neural networks (CNNs) or Transformers. In addition, some BIQA methods improve model performance through two-stage training. Despite significant advances, these methods substantially increase the parameter count of the model and therefore require more training time and computational resources. To tackle these issues, we propose a lightweight parallel framework (LPF) for BIQA. First, we extract visual features using a pre-trained feature extraction network. Next, we construct a simple yet effective feature embedding network (FEN) that transforms the visual features into latent representations containing salient distortion information. To improve the robustness of the latent representations, we present two novel self-supervised subtasks: a sample-level category prediction task and a batch-level quality comparison task. The sample-level category prediction task helps the model perceive distortions at a coarse-grained level. The batch-level quality comparison task augments the training data and thus further improves the robustness of the latent representations. Finally, the latent representations are fed into a distortion-aware quality regression network (DaQRN), which simulates the human visual system (HVS) to generate accurate quality scores. Experimental results on multiple benchmark datasets demonstrate that the proposed method achieves superior performance over state-of-the-art approaches. Moreover, extensive analyses show that the proposed method has lower computational complexity and faster convergence.
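The abstract does not specify how the batch-level quality comparison task is formulated. As a minimal sketch under that caveat, one common way to build such a task is to form every pair of images within a batch and train the model to predict which image of the pair has the higher quality score. The function names `pairwise_comparison_targets` and `comparison_loss`, the tie label of 0.5, and the logistic loss are all assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def pairwise_comparison_targets(scores):
    """Build comparison labels for every unordered pair (i, j), i < j, in a batch.

    Label is 1.0 if image i has the higher quality score, 0.0 if image j does,
    and 0.5 for a tie (the tie handling is an assumption of this sketch).
    """
    targets = []
    for i, j in combinations(range(len(scores)), 2):
        if scores[i] > scores[j]:
            label = 1.0
        elif scores[i] < scores[j]:
            label = 0.0
        else:
            label = 0.5
        targets.append((i, j, label))
    return targets


def comparison_loss(predictions, scores):
    """Mean binary cross-entropy over all pairs in the batch.

    The logit for a pair is predictions[i] - predictions[j]; the loss is
    written in its numerically stable "with logits" form.
    """
    targets = pairwise_comparison_targets(scores)
    if not targets:
        return 0.0  # a batch with fewer than two images has no pairs
    total = 0.0
    for i, j, label in targets:
        z = predictions[i] - predictions[j]
        total += max(z, 0.0) - z * label + math.log1p(math.exp(-abs(z)))
    return total / len(targets)
```

A batch of N images yields N(N-1)/2 comparison pairs, which is one way a batch-level task can enlarge the effective training signal without new annotations. For instance, `comparison_loss([2.0, 1.0], [5.0, 3.0])` is smaller than `comparison_loss([1.0, 2.0], [5.0, 3.0])`, because only the first set of predictions ranks the two images in the same order as their quality scores.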