Temporal prediction is one of the most important technologies for video compression. Traditional video codecs provide various prediction coding modes and adaptively select the optimal mode according to the prediction quality and the reference quality. Recently, learned video codecs have made great progress. However, they ignore prediction and reference quality adaptation, which leads to incorrect utilization of temporal prediction and to the propagation of reconstruction errors. Therefore, in this paper, we first propose a confidence-based prediction quality adaptation (PQA) module that explicitly discriminates between spatial and channel-wise differences in prediction quality. With this module, low-quality predictions are suppressed and high-quality predictions are enhanced, so the codec can adaptively decide which spatial or channel locations of the predictions to use. We further propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic, spatially variant filters for diverse reference qualities. With these filters, it is easier for our codec to reach the target reconstruction quality for a given reference quality, which reduces the propagation of reconstruction errors. Experimental results show that our codec achieves higher compression performance than the reference software of H.266/VVC and the previous state-of-the-art learned video codecs in both the RGB and YUV420 colorspaces.
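To make the idea of confidence-based prediction weighting more concrete, the following is a minimal sketch, not the PQA module itself. It assumes that a confidence value in [0, 1] is already available for every channel and spatial location, and it simply scales each prediction element by that value. The function name `modulate_prediction`, the nested-list layout `[channel][row][col]`, and the clamping step are all assumptions made for illustration; in the abstract the confidence is produced by a learned module, and high-quality predictions are also enhanced, which this toy version does not do.

```python
from typing import List

# Hypothetical layout: features indexed as [channel][row][col].
Feature = List[List[List[float]]]


def modulate_prediction(prediction: Feature, confidence: Feature) -> Feature:
    """Scale each prediction element by its confidence, clamped to [0, 1].

    Low-confidence locations are pushed toward zero (suppressed);
    locations with confidence 1.0 pass through unchanged.
    """
    out: Feature = []
    for pred_ch, conf_ch in zip(prediction, confidence):
        out_ch = []
        for pred_row, conf_row in zip(pred_ch, conf_ch):
            out_ch.append(
                [p * min(max(c, 0.0), 1.0) for p, c in zip(pred_row, conf_row)]
            )
        out.append(out_ch)
    return out


# One channel, 2x2 spatial grid of toy values.
prediction = [[[2.0, 4.0], [6.0, 8.0]]]
confidence = [[[1.0, 0.5], [0.0, 0.25]]]
weighted = modulate_prediction(prediction, confidence)
print(weighted)  # [[[2.0, 2.0], [0.0, 2.0]]]
```

Because the weight is applied per channel and per location, the same mechanism can discard an unreliable region in one channel while keeping the corresponding region in another, which is the kind of spatial and channel-wise selection the abstract describes.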