Recent advances in real-time super-resolution have enabled higher-quality video streaming, yet existing methods struggle with the unique challenges of compressed video content. Commonly used datasets do not accurately reflect the characteristics of streaming media, which limits the relevance of current benchmarks. To address this gap, we introduce StreamSR, a comprehensive dataset sourced from YouTube that covers a wide range of video genres and resolutions representative of real-world streaming scenarios. We benchmark 11 state-of-the-art real-time super-resolution models to evaluate their performance in the streaming use case. Furthermore, we propose EfRLFN, an efficient real-time model that integrates Efficient Channel Attention and a hyperbolic tangent activation function, a novel design choice in the context of real-time super-resolution. We extensively optimize the architecture to maximize efficiency and design a composite loss function that improves training convergence. EfRLFN combines the strengths of existing architectures while improving both visual quality and runtime performance. Finally, we show that fine-tuning other models on our dataset yields significant performance gains that generalize well across various standard benchmarks. The dataset, code, and benchmark are available at https://github.com/EvgeneyBogatyrev/EfRLFN.