Video super-resolution (VSR) techniques, especially deep-learning-based algorithms, have improved drastically over the last few years and show impressive performance on synthetic data. However, their performance on real-world video data suffers because of the complexity of real-world degradations and misaligned video frames. Since obtaining a synthetic dataset consisting of low-resolution (LR) and high-resolution (HR) frames is easier than obtaining real-world LR and HR images, in this paper we propose synthesizing real-world degradations on synthetic training datasets. The proposed synthetic real-world degradations (SRWD) include a combination of blur, noise, downsampling, pixel binning, and image and video compression artifacts. We then propose a random shuffling-based strategy to simulate these degradations on the training datasets and train a single end-to-end deep neural network (DNN) on the resulting larger variety of realistic synthesized training data. Our quantitative and qualitative comparative analysis shows that the proposed training strategy using diverse realistic degradations improves performance by 7.1% in terms of NRQM compared to RealBasicVSR and by 3.34% compared to BSRGAN on the VideoLQ dataset. We also introduce a new dataset of high-resolution real-world videos that can serve as a common ground for benchmarking.
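As a rough illustration of the random shuffling-based strategy described in the abstract, the following minimal sketch applies the five named degradation types to a clip in a randomly shuffled order. Everything specific here is an assumption made for illustration and not the authors' implementation: the function names, the 3x3 box blur, the noise level, the factor-2 decimation and 2x2 binning, the use of coarse quantization as a crude stand-in for image and video compression artifacts, and the choice to reuse one shuffled order for every frame of a clip. Frames are plain nested lists of grayscale values in [0, 1] so that the sketch needs only the standard library.

```python
import random


def clamp(v):
    """Keep a pixel value inside [0, 1]."""
    return min(1.0, max(0.0, v))


def box_blur(frame, rng):
    """3x3 box blur with edge replication (assumed stand-in for blur)."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(h - 1, max(0, y + dy))
                    xx = min(w - 1, max(0, x + dx))
                    total += frame[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def add_noise(frame, rng, sigma=0.02):
    """Additive Gaussian noise; sigma is an arbitrary illustrative value."""
    return [[clamp(v + rng.gauss(0.0, sigma)) for v in row] for row in frame]


def downsample(frame, rng):
    """Factor-2 decimation: keep every second row and column."""
    return [row[::2] for row in frame[::2]]


def pixel_bin(frame, rng):
    """2x2 pixel binning: average each non-overlapping 2x2 block."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
             + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def quantize(frame, rng, levels=16):
    """Coarse quantization, a crude stand-in for compression artifacts."""
    return [[round(v * (levels - 1)) / (levels - 1) for v in row] for row in frame]


# Hypothetical degradation pool mirroring the five types named in the abstract.
DEGRADATIONS = [box_blur, add_noise, downsample, pixel_bin, quantize]


def degrade_clip(clip, seed=None):
    """Shuffle the degradation order once, then apply it to every frame."""
    rng = random.Random(seed)
    order = DEGRADATIONS[:]
    rng.shuffle(order)
    degraded = []
    for frame in clip:
        for op in order:
            frame = op(frame, rng)
        degraded.append(frame)
    return degraded, [op.__name__ for op in order]


if __name__ == "__main__":
    # Three synthetic 8x8 "HR" frames; real training data would be video frames.
    hr_clip = [
        [[((x + y + t) % 8) / 7.0 for x in range(8)] for y in range(8)]
        for t in range(3)
    ]
    lr_clip, order = degrade_clip(hr_clip, seed=0)
    print("order:", order)
    print("LR frame size:", len(lr_clip[0]), "x", len(lr_clip[0][0]))
```

Because decimation and binning each halve the resolution, an 8x8 input always yields a 2x2 output regardless of the shuffled order. What varies between draws is the sequence in which blur, noise, and quantization interact with the two resolution changes, which is the source of the extra variety in the synthesized LR data.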