As a widely studied task, video restoration aims to enhance the quality of videos affected by various degradations, such as noise, blur, and compression artifacts. Among video restoration tasks, compressed video quality enhancement and video super-resolution are two of the main tasks, with significant value in practical scenarios. Recently, recurrent neural networks and transformers have attracted increasing research interest in this field, owing to their impressive capability in sequence-to-sequence modeling. However, training these models is not only costly but also relatively hard to converge because of exploding and vanishing gradients. To cope with these problems, we propose a two-stage framework consisting of a multi-frame recurrent network and a single-frame transformer. In addition, we develop multiple training strategies, such as transfer learning and progressive training, to shorten the training time and improve model performance. Benefiting from these technical contributions, our solution won two championships and one runner-up place in the NTIRE 2022 challenges on super-resolution and quality enhancement of compressed video. Code is available at https://github.com/ryanxingql/winner-ntire22-vqe.