Casually captured Neural Radiance Fields (NeRFs) suffer from artifacts such as floaters or flawed geometry when rendered outside the camera trajectory. Existing evaluation protocols often do not capture these effects, since they usually assess image quality only at every 8th frame of the training capture. To advance novel-view synthesis, we propose a new dataset and evaluation procedure in which two camera trajectories of the scene are recorded: one used for training and the other for evaluation. In this more challenging in-the-wild setting, we find that existing hand-crafted regularizers neither remove floaters nor improve scene geometry. We therefore propose a 3D diffusion-based method that leverages local 3D priors and a novel density-based score distillation sampling loss to discourage artifacts during NeRF optimization. We show that this data-driven prior removes floaters and improves scene geometry for casual captures.
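The difference between the two evaluation protocols can be illustrated with a minimal sketch. The function names and frame labels below are hypothetical, and starting the hold-out at index 0 is an assumption; the abstract only states that every 8th frame of the training capture is typically used for evaluation, and that the proposed procedure records a second trajectory for that purpose.

```python
from typing import List, Sequence, Tuple


def every_nth_holdout(frames: Sequence[str], n: int = 8) -> Tuple[List[str], List[str]]:
    """Conventional split: hold out every n-th frame of a single capture.

    Held-out views lie on the same camera trajectory as the training views,
    so artifacts visible only from other viewpoints may go unmeasured.
    """
    train = [f for i, f in enumerate(frames) if i % n != 0]
    test = [f for i, f in enumerate(frames) if i % n == 0]
    return train, test


def two_trajectory_split(
    train_capture: Sequence[str], eval_capture: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Two-trajectory split: train on one capture, evaluate on a separate one."""
    return list(train_capture), list(eval_capture)


if __name__ == "__main__":
    capture_a = [f"a_{i:03d}" for i in range(16)]  # hypothetical training trajectory
    capture_b = [f"b_{i:03d}" for i in range(4)]   # hypothetical evaluation trajectory

    train, test = every_nth_holdout(capture_a)
    print(len(train), test)

    train, test = two_trajectory_split(capture_a, capture_b)
    print(len(train), len(test))
```

In the first split every test view has training views as immediate neighbors along the same path; in the second, no evaluation frame comes from the training trajectory.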