Although many effective models and real-world datasets have been proposed for blind image quality assessment (BIQA), recent BIQA models tend to overfit to their specific training set. Hence, it remains difficult to measure the visual quality of an arbitrary real-world image accurately and robustly. In this paper, a robust BIQA method is designed based on three aspects: a robust training strategy, a large-scale real-world dataset, and a powerful backbone. First, multiple individual models based on the popular, state-of-the-art (SOTA) Swin-Transformer (SwinT) are trained separately on different real-world BIQA datasets. Then, these biased SwinT-based models are jointly used to generate pseudo-labels, which take the form of the probability of the relative quality of two random images rather than a fixed quality score. A large-scale real-world image dataset with 1,000,000 image pairs and pseudo-labels is then constructed for training the final cross-dataset-robust model. Experimental results on cross-dataset tests show that the proposed method even outperforms some SOTA methods that are trained directly on these datasets, verifying the robustness and generalization of our method.
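The pseudo-labeling step can be illustrated with a minimal sketch. The abstract does not state how the per-model predictions are combined, so everything below is an assumption made for illustration: the function name `pairwise_pseudo_label`, the Thurstone-style Gaussian comparison, the parameter `sigma`, and the simple averaging over models are all hypothetical and may differ from the authors' actual procedure.

```python
import math
from typing import Sequence


def pairwise_pseudo_label(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    sigma: float = 1.0,
) -> float:
    """Estimate P(image A has higher quality than image B).

    scores_a[k] and scores_b[k] are the quality scores that the k-th
    dataset-specific model predicts for images A and B. Each model only
    compares the two images on its own score scale.

    ASSUMPTION: a Thurstone-style model in which each score carries
    Gaussian noise with standard deviation `sigma`, and the per-model
    probabilities are averaged. This is not taken from the paper.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("need one score per model for each image")
    if sigma <= 0:
        raise ValueError("sigma must be positive")

    probs = []
    for s_a, s_b in zip(scores_a, scores_b):
        # Difference of two independent Gaussians has std sqrt(2) * sigma.
        z = (s_a - s_b) / (math.sqrt(2.0) * sigma)
        # Standard normal CDF written with the error function.
        probs.append(0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Three hypothetical models score two images on a shared, normalized scale.
    p = pairwise_pseudo_label([0.8, 0.6, 0.7], [0.5, 0.65, 0.4], sigma=0.2)
    print(f"P(A better than B) = {p:.3f}")
```

In this sketch the label is a soft probability rather than a fixed score, so models trained on datasets with different score ranges only need to agree on the ordering of a pair. A single `sigma` is meaningful only if each model's scores are first normalized to a comparable range; otherwise a per-model value would be needed.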