Under good conditions, Neural Radiance Fields (NeRFs) have shown impressive results on novel view synthesis tasks. NeRFs learn a scene's color and density fields by minimizing the photometric discrepancy between training views and differentiable renders of the scene. Once trained from a sufficient set of views, NeRFs can generate novel views from arbitrary camera positions. However, the scene geometry and color fields are severely under-constrained, which can lead to artifacts, especially when the NeRF is trained with few input views. To alleviate this problem, we learn a prior over scene geometry and color using a denoising diffusion model (DDM). Our DDM is trained on RGBD patches of the synthetic Hypersim dataset and can be used to predict the gradient of the logarithm of a joint probability distribution of color and depth patches. We show that these gradients of the log-prior over RGBD patches serve to regularize the geometry and color of a scene: during NeRF training, random RGBD patches are rendered and the estimated gradients of the log-likelihood are backpropagated to the color and density fields. Evaluations on LLFF, the most relevant dataset, show that our learned prior achieves improved quality in the reconstructed geometry and improved generalization to novel views. Evaluations on DTU show improved reconstruction quality among NeRF methods.
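The regularization idea described above, adding the gradient of a log-prior on rendered RGBD patches to the photometric gradient, can be illustrated with a toy sketch. This is not the paper's implementation. It assumes an isotropic Gaussian score in place of the trained DDM, a hypothetical linear "renderer" (`render_patch`) in place of volume rendering, and hand-written gradients in place of automatic differentiation. All function names, shapes, and hyperparameters are illustrative assumptions.

```python
import numpy as np


def toy_score(patch, mean, sigma):
    """Stand-in for the DDM (assumption): score of an isotropic Gaussian,
    grad_x log p(x) = (mean - x) / sigma^2."""
    return (mean - patch) / sigma**2


def render_patch(theta, basis):
    """Hypothetical linear 'renderer' from scene parameters to a flat RGBD patch."""
    return basis @ theta


def objective(theta, basis, target, mask, mean, sigma, weight):
    """Photometric loss on observed entries plus the weighted negative
    log-prior of the rendered patch (up to an additive constant)."""
    patch = render_patch(theta, basis)
    photo = 0.5 * np.sum((mask * (patch - target)) ** 2)
    neg_log_prior = 0.5 * np.sum((patch - mean) ** 2) / sigma**2
    return photo + weight * neg_log_prior


def regularized_step(theta, basis, target, mask, mean, sigma, weight, lr):
    """One gradient step combining the photometric and prior gradients."""
    patch = render_patch(theta, basis)
    # Gradient of the photometric term w.r.t. the patch (mask is binary).
    grad_photo = mask * (patch - target)
    # The score is the gradient of log p, so descending uses its negative.
    grad_prior = -toy_score(patch, mean, sigma)
    # Chain rule through the linear renderer: d(patch)/d(theta) = basis.
    grad_theta = basis.T @ (grad_photo + weight * grad_prior)
    return theta - lr * grad_theta


def run_demo(steps=200, seed=0):
    rng = np.random.default_rng(seed)
    # 2x2 patch with 4 channels (RGBD) = 16 values, driven by 8 parameters.
    basis = rng.normal(size=(16, 8)) / 4.0
    target = rng.uniform(size=16)
    # Pretend only every other value is observed, so the rest is
    # constrained by the prior alone.
    mask = (np.arange(16) % 2 == 0).astype(float)
    mean = np.full(16, 0.5)
    theta = np.zeros(8)
    args = (basis, target, mask, mean, 1.0, 0.1)  # ..., sigma, weight
    before = objective(theta, *args)
    for _ in range(steps):
        theta = regularized_step(theta, *args, lr=0.05)
    return before, objective(theta, *args)


if __name__ == "__main__":
    before, after = run_demo()
    print(f"objective before: {before:.4f}, after: {after:.4f}")
```

In this sketch the prior gradient reaches the scene parameters only through the rendered patch, mirroring how the abstract describes backpropagating the estimated log-likelihood gradients to the color and density fields. In a real setup, the Gaussian score would be replaced by the diffusion model's estimate and the chain rule would be handled by an autodiff framework.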