Neural Radiance Fields (NeRF) have demonstrated remarkable 3D reconstruction capabilities when trained on dense-view images. However, their performance deteriorates significantly in sparse-view settings. We observe that learning the 3D consistency of pixels across different views is crucial for improving reconstruction quality in such cases. In this paper, we propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels. Specifically, ConsistentNeRF uses depth-derived geometry information and a depth-invariant loss to concentrate on pixels that exhibit 3D correspondence and maintain consistent depth relationships. Extensive experiments on recent representative methods show that our approach considerably improves model performance under sparse-view conditions, achieving improvements of up to 94% in PSNR, 76% in SSIM, and 31% in LPIPS over the vanilla baselines across various benchmarks, including DTU, NeRF Synthetic, and LLFF.
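The abstract does not spell out the form of the depth-invariant loss. As a minimal sketch, assuming it behaves like a scale-and-shift-invariant depth loss (a common choice when a rendered depth map is compared against a monocular depth estimate whose absolute scale is unknown), the idea can be illustrated as follows. The function names `align_scale_shift` and `depth_invariant_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np

def align_scale_shift(pred, target):
    """Least-squares scale s and shift t such that s * pred + t approximates target."""
    # Design matrix with one column for the scale and one for the shift.
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)
    return s, t

def depth_invariant_loss(pred, target):
    """Mean squared error after optimally aligning pred to target.

    Assumption: the loss should ignore a global scale and shift of the
    depth values, so only the relative depth structure is penalized.
    """
    s, t = align_scale_shift(pred, target)
    return float(np.mean((s * pred + t - target) ** 2))

# Illustrative 1D "depth maps" (flattened pixels).
rendered = np.array([1.0, 2.0, 3.0, 4.0])
rescaled = 2.0 * rendered + 5.0              # same structure, different scale and shift
scrambled = np.array([4.0, 1.0, 3.0, 2.0])   # inconsistent depth relationships

print(depth_invariant_loss(rendered, rescaled))
print(depth_invariant_loss(rendered, scrambled))
```

Under this assumption, a depth map that differs from the reference only by a global scale and shift incurs essentially no penalty, while one whose relative depth relationships differ is penalized.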