Neural Radiance Fields (NeRF) have shown great success in novel view synthesis owing to their state-of-the-art quality and flexibility. However, NeRF requires dense input views (tens to hundreds) and long training times (hours to days) per scene to generate high-fidelity images. Although representing the radiance field with voxel grids can significantly accelerate optimization, we observe that with sparse inputs, voxel grids are more prone to overfitting the training views, producing holes and floaters that lead to artifacts. In this paper, we propose VGOS, an approach for fast (3-5 minutes) radiance field reconstruction from sparse inputs (3-10 views) that addresses these issues. To improve the performance of voxel-based radiance fields in sparse-input scenarios, we propose two methods: (a) an incremental voxel training strategy, which prevents overfitting by suppressing the optimization of peripheral voxels in the early stage of reconstruction; and (b) several regularization techniques that smooth the voxels and thereby avoid degenerate solutions. Experiments demonstrate that VGOS achieves state-of-the-art performance for sparse inputs while converging within minutes. Code will be available at https://github.com/SJoJoK/VGOS.
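The two ideas in (a) and (b) can be illustrated with a minimal NumPy sketch. It is not the VGOS implementation: the function names, the centered box that grows linearly over `warmup_steps`, the `start_frac` parameter, and the use of a squared-difference total-variation term as the smoothness regularizer are all assumptions made for illustration, since the abstract does not specify the schedule or the exact regularizers.

```python
import math
import numpy as np

def incremental_mask(grid_shape, step, warmup_steps, start_frac=0.5):
    """Boolean mask of voxels allowed to be optimized at a given step.

    Assumption: the trainable region is a box centered in the grid whose
    extent grows linearly from `start_frac` of each axis to the full grid
    over `warmup_steps` steps. Peripheral voxels outside the box are frozen.
    """
    frac = min(1.0, start_frac + (1.0 - start_frac) * step / warmup_steps)
    slices = []
    for n in grid_shape:
        half = max(1, math.ceil(n * frac / 2))  # half-width of the box
        center = n // 2
        slices.append(slice(max(0, center - half), min(n, center + half)))
    mask = np.zeros(grid_shape, dtype=bool)
    mask[tuple(slices)] = True
    return mask

def masked_update(grid, grad, mask, lr):
    """Gradient step that leaves voxels outside the mask unchanged."""
    return grid - lr * np.where(mask, grad, 0.0)

def total_variation(grid):
    """Hypothetical smoothness term: mean squared difference between
    neighboring voxels, summed over the grid axes."""
    return sum(float(np.mean(np.diff(grid, axis=a) ** 2))
               for a in range(grid.ndim))

# Usage: early in training only the central voxels receive updates.
density = np.zeros((8, 8, 8))
grad = np.ones_like(density)
early = incremental_mask(density.shape, step=0, warmup_steps=100)
updated = masked_update(density, grad, early, lr=0.1)
```

In a real pipeline the mask would be applied to the gradients of the density and color grids inside the optimizer loop, and the smoothness term would be added to the photometric loss with a weight chosen by validation.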