Exploiting spatial-angular correlation is crucial to light field (LF) image super-resolution (SR), but it is highly challenging because the correlation is non-local, a consequence of the disparities among LF images. Although many deep neural networks (DNNs) have been developed for LF image SR and have steadily improved performance, existing methods cannot effectively leverage long-range spatial-angular correlation and thus suffer a significant performance drop on scenes with large disparity variations. In this paper, we propose a simple yet effective method to learn the non-local spatial-angular correlation for LF image SR. We adopt the epipolar plane image (EPI) representation to project the 4D spatial-angular correlation onto multiple 2D EPI planes, and then develop a Transformer network with repeated self-attention operations that learns the spatial-angular correlation by modeling the dependencies between each pair of EPI pixels. Our method can fully incorporate the information from all angular views while achieving a global receptive field along the epipolar line. We conduct extensive experiments with insightful visualizations to validate the effectiveness of our method. Comparative results on five public datasets show that our method not only achieves state-of-the-art SR performance but is also robust to disparity variations. Code is publicly available at https://github.com/ZhengyuLiang24/EPIT.
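As a rough illustration of the idea in the abstract, the sketch below rearranges a toy 4D light field into horizontal EPIs and applies one self-attention step over all pixels of each EPI. It is not the authors' EPIT implementation: the array layout `(U, V, H, W, C)`, the function names `horizontal_epis` and `self_attention`, the single attention head, and the random, untrained projection matrices are all assumptions made for this example.

```python
import numpy as np


def horizontal_epis(lf):
    """Rearrange a light field of shape (U, V, H, W, C) into horizontal EPIs.

    Each EPI fixes a vertical angular index u and an image row h, and spans
    the horizontal angular axis v and the image column axis w.
    Returns shape (U * H, V * W, C): one token sequence per EPI.
    """
    U, V, H, W, C = lf.shape
    epis = lf.transpose(0, 2, 1, 3, 4)  # (U, H, V, W, C)
    return epis.reshape(U * H, V * W, C)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over each token sequence.

    tokens has shape (B, N, C). Returns the attended features (B, N, D) and
    the attention weights (B, N, N) between every pair of tokens.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy light field: 5x5 angular views, 3x6 pixels per view, 4 feature channels.
rng = np.random.default_rng(0)
U, V, H, W, C = 5, 5, 3, 6, 4
lf = rng.standard_normal((U, V, H, W, C))

tokens = horizontal_epis(lf)  # (15, 30, 4): 15 EPIs with 30 pixels each

# Hypothetical, untrained projections; a real network would learn these.
w_q, w_k, w_v = (0.1 * rng.standard_normal((C, C)) for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)

print(tokens.shape, out.shape, weights.shape)
```

Each row of `weights` spans all `V * W` pixels of one EPI, so every output pixel draws on every view and every position along that epipolar line, which is the global receptive field the abstract refers to. Vertical EPIs could be built the same way by swapping the roles of `(u, h)` and `(v, w)`.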