Omnidirectional video (ODV) provides an immersive visual experience and is widely used in virtual reality and augmented reality. However, limited capture devices and transmission bandwidth lead to low-resolution ODVs. Video super-resolution (SR) has been proposed to enhance resolution, but directly applying existing methods fails to address the spatial projection distortions and temporal flickering inherent in practical ODVs. To achieve better ODV-SR reconstruction, we propose a Spatio-Temporal Distortion Aware Network (STDAN) tailored to ODV characteristics. Specifically, a spatially continuous distortion modulation module is introduced to mitigate discrete projection distortions. Next, we design an interlaced multi-frame reconstruction mechanism to refine temporal consistency across frames. Furthermore, we incorporate latitude-saliency adaptive weights during training to concentrate on regions with higher texture complexity and greater human viewing interest. Overall, we explore inference-free strategies matched to real-world viewing, providing an application-friendly method evaluated on a novel ODV-SR dataset with practical scenarios. Extensive experimental results demonstrate the superior performance of the proposed STDAN over state-of-the-art methods.
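The latitude-adaptive part of the training weights can be illustrated with the standard cosine-latitude weighting used for equirectangular projections (as in WS-PSNR): rows near the equator of the frame cover more of the sphere than rows near the poles, so per-pixel errors are scaled accordingly. This is a minimal sketch of that idea only; the function names are hypothetical, and the paper's full scheme additionally incorporates saliency, which is omitted here.

```python
import numpy as np

def latitude_weights(height: int) -> np.ndarray:
    """Per-row cos-latitude weights for an equirectangular frame.

    Rows near the equator get weight close to 1; rows near the
    poles, which are heavily stretched by the projection, approach 0.
    """
    rows = np.arange(height)
    # Map row index to latitude in (-pi/2, pi/2), sampling row centers.
    lat = (rows + 0.5) / height * np.pi - np.pi / 2
    return np.cos(lat)

def latitude_weighted_l1(pred: np.ndarray, target: np.ndarray) -> float:
    """L1 loss with each pixel weighted by its row's latitude weight."""
    h, w = pred.shape
    weights = latitude_weights(h)[:, None]          # broadcast over columns
    num = (weights * np.abs(pred - target)).sum()   # weighted absolute error
    den = weights.sum() * w                         # normalizer over all pixels
    return float(num / den)
```

A saliency-aware variant would multiply these row weights by a per-pixel saliency map before normalizing, concentrating the loss on regions viewers are likely to watch.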