Numerous recent approaches have achieved notable success in compressed video quality enhancement (VQE). However, these methods usually overlook the valuable coding priors inherently embedded in compressed videos, such as motion vectors and residual frames, which carry abundant temporal and spatial information. To address this problem, we propose the Coding Priors-Guided Aggregation (CPGA) network, which exploits the temporal and spatial information in coding priors. CPGA mainly consists of an inter-frame temporal aggregation (ITA) module and a multi-scale non-local aggregation (MNA) module. Specifically, the ITA module aggregates temporal information from consecutive frames and coding priors, while the MNA module globally captures spatial information guided by residual frames. In addition, to facilitate research on the VQE task, we construct the Video Coding Priors (VCP) dataset, comprising 300 videos with various coding priors extracted from the corresponding bitstreams. It addresses the lack of coding information in previous datasets. Experimental results demonstrate the superiority of our method over existing state-of-the-art methods. The code and dataset will be released at https://github.com/CPGA/CPGA.git.
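To illustrate why motion vectors and residual frames carry temporal and spatial information, the following minimal sketch shows generic block-based motion compensation as used in hybrid video codecs. It is not the ITA or MNA module, which are learned network components; the function name `warp_with_motion_vectors`, the integer-pixel vectors, and the fixed block size are simplifying assumptions made only for this example.

```python
import numpy as np


def warp_with_motion_vectors(reference, motion_vectors, block_size=4):
    """Motion-compensate a 2-D `reference` frame (hypothetical helper).

    `motion_vectors` has shape (rows, cols, 2) and holds one integer
    (dy, dx) displacement per block; each block of the prediction is
    copied from the displaced location in the reference frame.
    """
    height, width = reference.shape
    predicted = np.zeros_like(reference)
    for by in range(0, height, block_size):
        for bx in range(0, width, block_size):
            dy, dx = motion_vectors[by // block_size, bx // block_size]
            # Source coordinates in the reference frame, clamped to the borders.
            ys = np.clip(np.arange(by, min(by + block_size, height)) + dy, 0, height - 1)
            xs = np.clip(np.arange(bx, min(bx + block_size, width)) + dx, 0, width - 1)
            predicted[by:by + block_size, bx:bx + block_size] = reference[np.ix_(ys, xs)]
    return predicted


# Toy example: an 8x8 frame split into four 4x4 blocks.
reference = np.arange(64, dtype=np.float32).reshape(8, 8)
motion_vectors = np.zeros((2, 2, 2), dtype=int)
motion_vectors[0, 0] = (0, 1)  # top-left block copies from one pixel to the right

predicted = warp_with_motion_vectors(reference, motion_vectors)
residual = np.zeros_like(reference)  # what the prediction failed to explain
reconstructed = predicted + residual
```

In this simplified view, the motion vectors describe how content moves between frames (temporal information), while the residual marks the regions the prediction could not explain (spatial information), which is the kind of signal a guided aggregation network could draw on.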