Neural Radiance Fields (NeRF) have achieved substantial progress in novel view synthesis from multi-view images. Recently, some works have attempted to train a NeRF from a single image with 3D priors. These methods mainly target a limited field of view with few occluded regions, which greatly limits their scalability to real-world 360-degree panoramic scenes with large occlusions. In this paper, we present PERF, a 360-degree novel view synthesis framework that trains a panoramic neural radiance field from a single panorama. Notably, PERF allows 3D roaming in a complex scene without expensive and tedious image collection. To achieve this goal, we propose a novel collaborative RGBD inpainting method and a progressive inpainting-and-erasing method to lift a 360-degree 2D scene to a 3D scene. Specifically, we first predict a panoramic depth map from the single panorama as initialization and reconstruct the visible 3D regions with volume rendering. We then introduce a collaborative RGBD inpainting approach into the NeRF to complete RGB images and depth maps from random views; it is derived from an RGB Stable Diffusion model and a monocular depth estimator. Finally, we introduce an inpainting-and-erasing strategy to avoid inconsistent geometry between a newly sampled view and the reference views. These two components are integrated into NeRF learning in a unified optimization framework and achieve promising results. Extensive experiments on Replica and a new dataset, PERF-in-the-wild, demonstrate the superiority of PERF over state-of-the-art methods. PERF can be widely used in real-world applications such as panorama-to-3D, text-to-3D, and 3D scene stylization. Project page and code are available at https://perf-project.github.io/.
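To make the initialization step more concrete, the following is a minimal sketch of how a panorama and its predicted depth map could be lifted to 3D points. It is not the PERF implementation. It assumes an equirectangular panorama, a y-up and z-forward axis convention, and a depth map that stores Euclidean distance along each ray; the function names `panorama_rays` and `unproject_panorama` are hypothetical and chosen only for illustration.

```python
import numpy as np


def panorama_rays(height, width):
    """Unit ray directions for every pixel of an equirectangular panorama.

    Assumed convention (not taken from the paper): y is up, z is forward,
    the top image row looks up, and longitude spans [-pi, pi).
    """
    # Normalized pixel-center coordinates in (0, 1).
    u = (np.arange(width) + 0.5) / width
    v = (np.arange(height) + 0.5) / height

    lon = (u - 0.5) * 2.0 * np.pi  # longitude per column
    lat = (0.5 - v) * np.pi        # latitude per row, positive at the top

    lon, lat = np.meshgrid(lon, lat)  # both arrays have shape (H, W)

    # Spherical-to-Cartesian conversion on the unit sphere.
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)  # shape (H, W, 3)


def unproject_panorama(depth):
    """Lift a panoramic depth map of shape (H, W) to 3D points (H, W, 3).

    Assumes `depth` holds the distance from the camera center along each
    ray, so every point is simply direction * distance.
    """
    height, width = depth.shape
    return panorama_rays(height, width) * depth[..., None]
```

Under these assumptions, calling `unproject_panorama` on a depth map predicted for the input panorama yields one 3D point per pixel, expressed in the coordinate frame of the panorama camera. Such points only cover the regions visible from the capture position; the occluded regions are what the inpainting stages described in the abstract are meant to complete.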