Immersive experiences that let users explore virtual environments with six degrees of freedom (6DoF) are essential for applications such as virtual reality (VR). Wide-baseline panoramas are commonly used in these applications to reduce network bandwidth and storage requirements. However, synthesizing novel views from these panoramas remains a key challenge. Although existing neural radiance field methods can produce photorealistic views from dense, narrow-baseline image captures, they tend to overfit the training views when applied to \emph{wide-baseline} panoramas, because accurate geometry is difficult to learn from sparse $360^{\circ}$ views. To address this problem, we propose PanoGRF, Generalizable Spherical Radiance Fields for Wide-baseline Panoramas, which constructs spherical radiance fields that incorporate $360^{\circ}$ scene priors. Unlike generalizable radiance fields trained on perspective images, PanoGRF avoids the information loss caused by panorama-to-perspective conversion and directly aggregates the geometry and appearance features of 3D sample points from each panoramic view using spherical projection. Moreover, because some regions of a panorama are visible from only one view under wide-baseline settings, PanoGRF incorporates $360^{\circ}$ monocular depth priors into spherical depth estimation to improve the geometry features. Experimental results on multiple panoramic datasets demonstrate that PanoGRF significantly outperforms state-of-the-art generalizable view synthesis methods for wide-baseline panoramas (e.g., OmniSyn) and for perspective images (e.g., IBRNet, NeuRay).
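The spherical projection mentioned above can be illustrated with a minimal sketch that maps 3D sample points to pixel coordinates in an equirectangular panorama. This is not the PanoGRF implementation: the function name `project_to_equirect`, the axis convention (x right, y down, z forward, camera axes aligned with world axes), and the equirectangular layout are all assumptions made for illustration.

```python
import numpy as np


def project_to_equirect(points_world, cam_center, width, height):
    """Project 3D points into an equirectangular panorama.

    Assumed convention (hypothetical, not taken from the paper):
    camera axes are aligned with world axes, x points right, y points
    down, z points forward. Longitude is measured from +z toward +x,
    latitude is positive upward.

    Returns pixel coordinates (u, v) and the radial distance r of each
    point from the panorama center.
    """
    d = np.asarray(points_world, dtype=np.float64) - np.asarray(
        cam_center, dtype=np.float64
    )
    # Radial distance, i.e. the "spherical depth" of each point.
    r = np.linalg.norm(d, axis=-1)
    # Longitude in [-pi, pi].
    lon = np.arctan2(d[..., 0], d[..., 2])
    # Latitude in [-pi/2, pi/2]; guard against division by zero.
    lat = np.arcsin(np.clip(-d[..., 1] / np.maximum(r, 1e-12), -1.0, 1.0))
    # Map angles linearly onto the image grid.
    u = (lon / (2.0 * np.pi) + 0.5) * width
    v = (0.5 - lat / np.pi) * height
    return u, v, r
```

Under these assumptions, a point straight ahead of the camera lands at the image center, and the returned `(u, v)` could then be used to sample per-view feature maps (for example with bilinear interpolation) for each 3D sample point along a target ray.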