As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike planar 2D images, ODIs are formed on spherical surfaces. Applying established image super-resolution methods to ODIs therefore requires equirectangular projection (ERP) to map them onto a plane, and ODI super-resolution must account for the geometric distortion that ERP introduces. However, previous deep-learning-based methods ignore this geometric distortion of ERP images; as a result, they utilize only a limited range of pixels and easily miss self-similar textures useful for reconstruction. In this paper, we introduce a novel Geometric Distortion Guided Transformer for Omnidirectional image Super-Resolution (GDGT-OSR). Specifically, a distortion-modulated rectangle-window self-attention mechanism, integrated with deformable self-attention, is proposed to better perceive the distortion and thus exploit more self-similar textures. Distortion modulation is achieved through a newly devised distortion guidance generator that produces guidance by exploiting the variability of distortion across latitudes. Furthermore, we propose a dynamic feature aggregation scheme to adaptively fuse the features from different self-attention modules. We present extensive experimental results on public datasets and show that GDGT-OSR outperforms existing methods in the literature.
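The latitude dependence that the distortion guidance generator exploits follows from ERP geometry: each image row corresponds to a fixed latitude, and horizontal stretching grows as 1/cos(latitude), diverging toward the poles. The sketch below computes such a per-row distortion map; it is a minimal illustration of this geometric fact, and the function name, tensor layout, and clamping constant are our assumptions, not the paper's actual generator.

```python
import torch

def erp_distortion_map(height: int, width: int) -> torch.Tensor:
    """Per-pixel horizontal stretch factor of an equirectangular (ERP) image.

    Rows map to latitudes in [-pi/2, pi/2]; horizontal stretching scales as
    1/cos(latitude), diverging toward the poles. Returns shape (1, 1, H, W)
    so the map can broadcast against a (B, C, H, W) feature tensor.
    NOTE: an illustrative sketch, not the GDGT-OSR guidance generator.
    """
    # Latitude of each row center, from +pi/2 (top row) to -pi/2 (bottom row).
    lat = (0.5 - (torch.arange(height) + 0.5) / height) * torch.pi
    # Clamp cos(lat) away from zero to avoid blow-up at the pole rows.
    stretch = 1.0 / torch.cos(lat).clamp(min=1e-3)
    return stretch.view(1, 1, height, 1).expand(1, 1, height, width)

if __name__ == "__main__":
    d = erp_distortion_map(64, 128)
    print(d[0, 0, 32, 0].item())  # ~1.0 near the equator
    print(d[0, 0, 0, 0].item())   # large near the pole
```

A map like this makes the motivation concrete: pixels at high latitudes cover far less spherical area than their planar extent suggests, so attention windows of fixed shape sample the sphere unevenly unless they are modulated by such latitude-dependent guidance.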