There is an emerging effort to combine two popular 3D reconstruction frameworks, Multi-View Stereo (MVS) and Neural Implicit Surfaces (NIS), with a specific focus on the few-shot/sparse-view setting. In this paper, we introduce a novel integration scheme that combines multi-view stereo with neural signed distance function representations, which can potentially overcome the limitations of both methods. MVS uses per-view depth estimation and cross-view fusion to generate accurate surfaces, while NIS relies on a common coordinate volume. Based on this strategy, we propose to construct a per-view cost frustum for finer geometry estimation, and then fuse the cross-view frustums and estimate the implicit signed distance function to address artifacts caused by noise and holes in the reconstructed surface. We further apply a cascade frustum fusion strategy to effectively capture global-local information and structural consistency. Finally, we apply cascade sampling and a pseudo-geometric loss to foster stronger integration between the two architectures. Extensive experiments demonstrate that our method reconstructs surfaces robustly and outperforms existing state-of-the-art methods.
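To make the idea of cascade sampling concrete, the following is a minimal sketch of a generic coarse-to-fine depth search along a single ray, in the spirit of cascade MVS pipelines. It is not the implementation used in this paper: the function name `cascade_depth_search`, the per-stage sample counts, the `shrink` factor, and the toy cost function are all assumptions introduced for illustration only.

```python
import numpy as np


def cascade_depth_search(cost_fn, near, far, num_samples=(16, 8, 8), shrink=0.25):
    """Coarse-to-fine 1D search for the depth that minimises cost_fn on one ray.

    Hypothetical helper for illustration; not the paper's method.
    Each stage samples depth hypotheses uniformly in the current interval,
    keeps the lowest-cost hypothesis, and narrows the interval around it.
    """
    lo, hi = float(near), float(far)
    best = 0.5 * (lo + hi)
    for n in num_samples:
        # Uniform depth hypotheses inside the current search interval.
        depths = np.linspace(lo, hi, n)
        costs = np.array([cost_fn(d) for d in depths])
        best = float(depths[np.argmin(costs)])
        # Shrink the interval to a fraction of its width, centred on the
        # current estimate and clamped to the original [near, far] range.
        half_width = 0.5 * shrink * (hi - lo)
        lo = max(float(near), best - half_width)
        hi = min(float(far), best + half_width)
    return best


# Toy matching cost (an assumption): minimal at a known "true" depth.
true_depth = 2.37
estimate = cascade_depth_search(lambda d: abs(d - true_depth), near=0.5, far=5.0)
print(f"estimated depth: {estimate:.4f}")
```

In a real pipeline the cost would come from multi-view feature matching rather than a closed-form function, and the search would run in parallel for every pixel of the frustum; the sketch only shows how successive stages concentrate samples near the current estimate.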