PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas

Immersive experiences that let users explore virtual environments with six degrees of freedom (6DoF) are essential for applications such as virtual reality (VR). Wide-baseline panoramas are commonly used in these applications to reduce network bandwidth and storage requirements. However, synthesizing novel views from these panoramas remains a key challenge. Although existing neural radiance field methods can produce photorealistic views from narrow-baseline, densely captured images, they tend to overfit the training views when applied to \emph{wide-baseline} panoramas because accurate geometry is difficult to learn from sparse $360^{\circ}$ views. To address this problem, we propose PanoGRF, Generalizable Spherical Radiance Fields for Wide-baseline Panoramas, which constructs spherical radiance fields that incorporate $360^{\circ}$ scene priors. Unlike generalizable radiance fields trained on perspective images, PanoGRF avoids the information loss of panorama-to-perspective conversion and directly aggregates geometry and appearance features of 3D sample points from each panoramic view using spherical projection. Moreover, because some regions of a panorama are visible from only one view under wide-baseline settings, PanoGRF incorporates $360^{\circ}$ monocular depth priors into spherical depth estimation to improve the geometry features. Experimental results on multiple panoramic datasets demonstrate that PanoGRF significantly outperforms state-of-the-art generalizable view synthesis methods for wide-baseline panoramas (e.g., OmniSyn) and perspective images (e.g., IBRNet, NeuRay).
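
The spherical projection mentioned above can be illustrated for the common equirectangular panorama format. The sketch below is only an illustration and is not taken from PanoGRF's implementation: the function name `project_to_equirect`, the axis convention (x right, y down, z forward), and the pixel mapping are all assumptions chosen for this example. It maps a 3D sample point, expressed in a panorama's camera frame, to the pixel location where features would be looked up, and also returns the point's distance from the camera center.

```python
import math


def project_to_equirect(point, width, height):
    """Project a 3D point (camera frame) onto an equirectangular image.

    Assumed convention (hypothetical, not from the paper):
    x points right, y points down, z points forward.
    Returns (u, v, r): continuous pixel coordinates and radial distance.
    """
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point coincides with the camera center")

    # Longitude: angle around the vertical axis, 0 at the forward direction.
    lon = math.atan2(x, z)
    # Latitude: elevation from the horizontal plane, positive downward.
    lat = math.atan2(y, math.hypot(x, z))

    # Map longitude [-pi, pi] to [0, width] and latitude [-pi/2, pi/2] to [0, height].
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (lat / math.pi + 0.5) * height
    return u, v, r


if __name__ == "__main__":
    # The same 3D point seen from two panoramas separated by a 1 m baseline
    # along x (both cameras are assumed axis-aligned for simplicity).
    point_world = (0.5, 0.0, 2.0)
    for cam_center in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]:
        point_cam = tuple(p - c for p, c in zip(point_world, cam_center))
        print(cam_center, project_to_equirect(point_cam, 1024, 512))
```

Because every direction on the sphere has a valid pixel, a sample point can be projected into each panoramic view without first cropping the panorama into perspective images, which is the conversion step the abstract identifies as a source of information loss.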
