In this paper, we propose a novel method for 3D scene and object reconstruction from sparse multi-view images. Unlike previous methods that rely on extra information such as depth or generalizable features learned across scenes, our approach exploits the scene properties embedded in the multi-view inputs to create precise pseudo-labels for optimization without any prior training. Specifically, we introduce a geometry-guided approach that improves surface reconstruction accuracy from sparse views by using spherical harmonics to predict radiance at novel views while holistically considering all color observations of a point in the scene. In addition, our pipeline exploits proxy geometry to correctly handle occlusion when generating the radiance pseudo-labels, which previous image-warping methods fail to do. Our method, dubbed Ray Augmentation (RayAug), achieves superior results on the DTU and Blender datasets without requiring prior training, demonstrating its effectiveness for sparse-view reconstruction. Our pipeline is flexible and can be integrated into other implicit neural reconstruction methods for sparse views.
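As a rough illustration of the idea of predicting radiance at a novel view from all color observations of a surface point, the sketch below fits low-order real spherical harmonics to the observed colors by regularized least squares and evaluates the fit in a new direction. It is a minimal sketch under stated assumptions, not the RayAug implementation: the function names, the choice of degree 1, the ridge term, and the `visible` mask (assumed to come from an occlusion test against some proxy geometry) are all hypothetical.

```python
import numpy as np

def sh_basis_deg1(dirs):
    """Real spherical-harmonic basis up to degree 1: (N, 3) directions -> (N, 4).

    The sign convention (all positive) is an assumption; conventions differ.
    """
    dirs = np.asarray(dirs, dtype=float)
    dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
    c1 = 0.4886025119029199   # sqrt(3 / (4 * pi))
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=-1)

def fit_sh(view_dirs, colors, visible, ridge=1e-3):
    """Fit per-channel SH coefficients to the colors seen from unoccluded views.

    view_dirs: (N, 3) directions from the surface point to each camera.
    colors:    (N, 3) RGB observations of the point in [0, 1].
    visible:   (N,) boolean mask, assumed to come from a proxy-geometry
               occlusion test (hypothetical input, not computed here).
    """
    visible = np.asarray(visible, dtype=bool)
    B = sh_basis_deg1(view_dirs)[visible]
    C = np.asarray(colors, dtype=float)[visible]
    # Ridge-regularized normal equations keep the fit stable with few views.
    A = B.T @ B + ridge * np.eye(B.shape[1])
    return np.linalg.solve(A, B.T @ C)  # shape (4, 3)

def predict_radiance(coeffs, novel_dirs):
    """Evaluate the fitted SH at novel view directions to get pseudo-label colors."""
    return np.clip(sh_basis_deg1(novel_dirs) @ coeffs, 0.0, 1.0)
```

In this sketch, occluded observations are simply dropped before fitting, so a color sampled from an occluding surface does not contaminate the pseudo-label. With very few visible views the ridge term dominates the higher-order coefficients, and the prediction falls back toward a view-independent color.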