In this paper, we propose a novel method for reconstructing 3D scenes and objects from sparse multi-view images. Unlike previous methods that rely on extra information such as depth or generalizable features learned across scenes, our approach exploits the scene properties embedded in the multi-view inputs to create precise pseudo-labels for optimization, without any prior training. Specifically, we introduce a geometry-guided approach that improves surface reconstruction accuracy from sparse views: it uses spherical harmonics to predict novel-view radiance while holistically considering all color observations of a point in the scene. In addition, our pipeline exploits proxy geometry to correctly handle occlusion when generating the radiance pseudo-labels, a problem that previous image-warping methods fail to avoid. Our method, dubbed Ray Augmentation (RayAug), achieves superior results on the DTU and Blender datasets without requiring prior training, demonstrating its effectiveness for sparse-view reconstruction. The pipeline is flexible and can be integrated into other implicit neural reconstruction methods for sparse views.
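To make the idea of spherical-harmonic radiance pseudo-labels concrete, the following is a minimal sketch and not the RayAug implementation. It assumes a degree-1 real spherical-harmonic basis (one common sign convention), a ridge-regularized least-squares fit, and a precomputed boolean visibility mask standing in for the occlusion test against proxy geometry. The function names `sh_basis_deg1`, `fit_sh`, and `predict_radiance`, as well as the `visible` and `ridge` parameters, are hypothetical and chosen only for illustration.

```python
import numpy as np

def sh_basis_deg1(dirs):
    """Real spherical-harmonic basis up to degree 1, shape (N, 4).

    Assumption: one common sign convention; other conventions flip some signs.
    """
    dirs = np.asarray(dirs, dtype=np.float64)
    dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)  # unit vectors
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
    c1 = 0.4886025119029199   # sqrt(3 / (4 * pi))
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=-1)

def fit_sh(view_dirs, colors, visible=None, ridge=1e-3):
    """Fit per-channel SH coefficients to all color observations of one point.

    view_dirs: (N, 3) viewing directions, colors: (N, 3) RGB observations.
    visible:   optional (N,) boolean mask; occluded observations are dropped.
               (Hypothetical stand-in for an occlusion test on proxy geometry.)
    ridge:     regularization weight, needed when views are very sparse.
    """
    view_dirs = np.asarray(view_dirs, dtype=np.float64)
    colors = np.asarray(colors, dtype=np.float64)
    if visible is not None:
        keep = np.asarray(visible, dtype=bool)
        view_dirs, colors = view_dirs[keep], colors[keep]
    basis = sh_basis_deg1(view_dirs)                        # (M, 4)
    normal = basis.T @ basis + ridge * np.eye(basis.shape[1])
    return np.linalg.solve(normal, basis.T @ colors)        # (4, 3)

def predict_radiance(coeffs, new_dirs):
    """Evaluate the fitted SH at new directions to obtain RGB pseudo-labels."""
    return np.clip(sh_basis_deg1(new_dirs) @ coeffs, 0.0, 1.0)

# Usage: three sparse views of one surface point, the third one occluded.
obs_dirs = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [-0.6, 0.0, 0.8]]
obs_rgb = [[0.80, 0.40, 0.20], [0.75, 0.38, 0.20], [0.10, 0.10, 0.10]]
coeffs = fit_sh(obs_dirs, obs_rgb, visible=[True, True, False])
pseudo_label = predict_radiance(coeffs, [[0.3, 0.0, 0.95]])
print(pseudo_label)
```

The mask matters because an occluded view records the color of a different surface; fitting it alongside the valid observations would bias the predicted radiance, which is the failure the abstract attributes to image-warping approaches.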