Deep learning-based video quality assessment (deep VQA) has shown significant potential to surpass conventional metrics, offering improved correlation with human perception. However, the practical deployment of deep VQA models is often limited by their high computational complexity and large memory requirements. To address this issue, we aim to significantly reduce the model size and runtime of one of the state-of-the-art deep VQA methods, RankDVQA, by employing a two-phase workflow that integrates pruning-driven model compression with multi-level knowledge distillation. The resulting lightweight full-reference quality metric, RankDVQA-mini, uses fewer than 10% of the parameters of the full version (14% of its FLOPs), while still retaining quality prediction performance superior to that of most existing deep VQA methods. The source code of RankDVQA-mini has been released at https://chenfeng-bristol.github.io/RankDVQA-mini/ for public evaluation.
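To make the two-phase idea concrete, the following is a minimal sketch of what pruning followed by multi-level distillation can look like in general. It is not the RankDVQA-mini implementation: the L1-norm channel ranking, the squared-error losses, the function names `select_channels` and `multi_level_distillation_loss`, and the weights `alpha` and `beta` are all assumptions chosen for illustration, since the abstract does not specify the pruning criterion or the distillation losses.

```python
# Illustrative sketch only: generic magnitude-based channel pruning and a
# two-level (output + feature) distillation loss. All names, criteria and
# weights here are assumptions, not the method used by RankDVQA-mini.

def select_channels(filters, keep_ratio):
    """Phase 1 (assumed criterion): keep the channels with the largest L1 norm.

    filters: list of channels, each a flat list of weights.
    Returns the sorted indices of the channels that are kept.
    """
    norms = [sum(abs(w) for w in channel) for channel in filters]
    n_keep = max(1, int(keep_ratio * len(filters)))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_level_distillation_loss(student_feats, teacher_feats,
                                  student_score, teacher_score, target_score,
                                  alpha=0.5, beta=0.25):
    """Phase 2 (assumed form): supervise the pruned student at several levels.

    task    : student prediction vs. ground-truth quality label
    output  : student prediction vs. teacher prediction
    feature : intermediate student features vs. teacher features, averaged
              over the matched layers
    """
    task = (student_score - target_score) ** 2
    output = (student_score - teacher_score) ** 2
    feature = sum(mse(s, t) for s, t in zip(student_feats, teacher_feats))
    feature /= len(student_feats)
    return task + alpha * output + beta * feature


if __name__ == "__main__":
    toy_filters = [[0.1, -0.1], [1.0, 2.0], [0.0, 0.05], [-3.0, 0.5]]
    print(select_channels(toy_filters, keep_ratio=0.5))
    print(multi_level_distillation_loss([[1.0, 2.0]], [[1.0, 4.0]],
                                        student_score=0.5,
                                        teacher_score=1.0,
                                        target_score=0.0))
```

In this sketch the student is the pruned network and the teacher is the full model; matching intermediate features in addition to the final score is one common way to interpret "multi-level" distillation, but other formulations are possible.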