Omnidirectional images (ODIs) have attracted considerable research interest for immersive experiences. Although ODIs require extremely high resolution to capture the details of an entire scene, the resolution of most ODIs is insufficient. Previous methods attempt to solve this issue by applying image super-resolution (SR) to equirectangular projection (ERP) images. However, they ignore the geometric properties of ERP in the degradation process, and their models can hardly generalize to real ERP images. In this paper, we propose Fisheye downsampling, which mimics the real-world imaging process and synthesizes more realistic low-resolution samples. We then design a distortion-aware Transformer (OSRT) to modulate ERP distortions continuously and self-adaptively. Without a cumbersome process, OSRT outperforms previous methods by about 0.2 dB in PSNR. Moreover, we propose a convenient data augmentation strategy that synthesizes pseudo ERP images from plain images. This simple strategy alleviates the over-fitting problem of large networks and significantly boosts the performance of ODI super-resolution (ODISR). Extensive experiments demonstrate the state-of-the-art performance of OSRT. Code and models will be available at https://github.com/Fanghua-Yu/OSRT.
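To make the geometric property of ERP mentioned above concrete, the following is a minimal sketch of the latitude-dependent distortion of an ERP image. It is an illustration only and is not taken from the OSRT code; the function name `erp_distortion_map` is hypothetical. Every row of an ERP image has the same number of pixels, while the circle of latitude φ it represents shrinks in proportion to cos φ, so rows near the poles are stretched horizontally by roughly 1/cos φ. The sketch assumes pixel-center sampling, with the top row near the north pole.

```python
import numpy as np

def erp_distortion_map(height, width):
    """Return a (height, width) array holding cos(latitude) for each ERP pixel.

    Hypothetical helper for illustration; not part of the OSRT code base.
    Values are close to 1 at the equator (little stretching) and approach 0
    at the poles (strong horizontal stretching in the ERP image).
    """
    rows = np.arange(height, dtype=np.float64)
    # Latitude of each row's pixel center, from about +pi/2 (top) to -pi/2 (bottom).
    latitude = (0.5 - (rows + 0.5) / height) * np.pi
    stretch = np.cos(latitude)
    # The distortion depends only on latitude, so repeat it across all columns.
    return np.tile(stretch[:, None], (1, width))

if __name__ == "__main__":
    distortion = erp_distortion_map(4, 8)
    print(distortion[:, 0])
```

A map of this kind could plausibly serve as a per-pixel condition for a distortion-aware model or as a weighting term in evaluation, but how OSRT actually encodes the distortion is not specified in this abstract.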