3D reconstruction methods such as Neural Radiance Fields (NeRFs) excel at rendering photorealistic novel views of complex scenes. However, recovering a high-quality NeRF typically requires tens to hundreds of input images, resulting in a time-consuming capture process. We present ReconFusion to reconstruct real-world scenes using only a few photos. Our approach leverages a diffusion prior for novel view synthesis, trained on synthetic and multiview datasets, which regularizes a NeRF-based 3D reconstruction pipeline at novel camera poses beyond those captured by the set of input images. Our method synthesizes realistic geometry and texture in underconstrained regions while preserving the appearance of observed regions. We perform an extensive evaluation across various real-world datasets, including forward-facing and 360-degree scenes, demonstrating significant performance improvements over previous few-view NeRF reconstruction approaches.
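As a rough illustration of how a view-synthesis prior can regularize reconstruction at unobserved poses, the sketch below combines a reconstruction term on the captured views with a prior term on novel views. The abstract does not specify the loss, so everything here is an assumption made for illustration: the mean-squared-error terms, the `prior_weight` value, and the hypothetical callables `render` (standing in for the NeRF renderer) and `prior_sample` (standing in for the diffusion model's predicted view). It is not the authors' implementation.

```python
from typing import Callable, Sequence, Tuple

Pose = float                 # toy stand-in for a camera pose
Image = Sequence[float]      # toy stand-in for a flattened image


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two equally sized toy images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def few_view_loss(
    render: Callable[[Pose], Image],
    observed: Sequence[Tuple[Pose, Image]],
    novel_poses: Sequence[Pose],
    prior_sample: Callable[[Pose], Image],
    prior_weight: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Reconstruction loss on input views plus a weighted prior loss on novel views."""
    # Term 1: renders at captured poses should match the input photos.
    recon = sum(mse(render(p), img) for p, img in observed) / len(observed)
    # Term 2: renders at unobserved poses are pulled toward the prior's prediction.
    prior = sum(mse(render(p), prior_sample(p)) for p in novel_poses) / len(novel_poses)
    return recon + prior_weight * prior
```

In this toy form, the first term preserves the appearance of observed regions, while the second term constrains regions that the input photos leave underdetermined.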