Neural Radiance Fields (NeRF) have shown great success in novel view synthesis owing to their state-of-the-art quality and flexibility. However, to generate high-fidelity images of a single scene, NeRF requires dense input views (tens to hundreds) and a long training time (hours to days). Although representing the radiance field with voxel grids can significantly accelerate optimization, we observe that with sparse inputs the voxel grids are more prone to overfitting to the training views and develop holes and floaters, which lead to artifacts. In this paper, we propose VGOS, an approach for fast (3-5 minutes) radiance field reconstruction from sparse inputs (3-10 views) that addresses these issues. To improve the performance of voxel-based radiance fields in sparse-input scenarios, we propose two methods: (a) We introduce an incremental voxel training strategy, which prevents overfitting by suppressing the optimization of peripheral voxels in the early stage of reconstruction. (b) We use several regularization techniques to smooth the voxels, which avoids degenerate solutions. Experiments demonstrate that VGOS achieves state-of-the-art performance for sparse inputs while converging very quickly. Code will be available at https://github.com/SJoJoK/VGOS.
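The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the VGOS implementation: the function names, the centered-box region, the linear growth schedule, the `start_frac` and `lr` parameters, and the squared-difference smoothness term are all assumptions chosen only to make the ideas concrete. The abstract says only that peripheral voxels are suppressed early in reconstruction and that several regularizers smooth the voxels.

```python
import numpy as np


def incremental_mask(shape, step, total_steps, start_frac=0.3):
    """Boolean mask of voxels that may be optimized at `step`.

    Assumed schedule: the allowed region is a centered box whose
    half-extent grows linearly from `start_frac` of the grid to the
    whole grid over `total_steps` steps.
    """
    t = min(1.0, step / total_steps)
    frac = (1.0 - t) * start_frac + t
    mask = np.ones(shape, dtype=bool)
    for axis, n in enumerate(shape):
        # Normalized voxel coordinates in [-1, 1] along this axis.
        coords = np.linspace(-1.0, 1.0, n)
        inside = np.abs(coords) <= frac
        # Reshape so the 1-D test broadcasts along the current axis.
        view = [1] * len(shape)
        view[axis] = n
        mask &= inside.reshape(view)
    return mask


def masked_update(grid, grad, mask, lr=0.1):
    """One gradient step that leaves voxels outside the mask unchanged."""
    return grid - lr * grad * mask


def smoothness_loss(grid):
    """Mean squared difference between adjacent voxels, summed over axes.

    An assumed stand-in for the smoothing regularizers in the abstract.
    """
    return sum(np.mean(np.diff(grid, axis=a) ** 2) for a in range(grid.ndim))


if __name__ == "__main__":
    grid = np.zeros((8, 8, 8))
    grad = np.ones_like(grid)  # placeholder gradient, for illustration only
    for step in (0, 5, 10):
        mask = incremental_mask(grid.shape, step, total_steps=10)
        print(step, int(mask.sum()))
    updated = masked_update(grid, grad, incremental_mask(grid.shape, 0, 10))
    print(np.count_nonzero(updated), float(smoothness_loss(updated)))
```

In a real training loop, the mask would be applied to the gradients of the density and color grids at every iteration, and the smoothness term would be added to the photometric loss with a tunable weight. Both of those details are assumptions here rather than statements about VGOS.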