Reconstructing accurate surfaces from sparse multi-view images remains challenging due to severe geometric ambiguity and occlusions. Existing generalizable neural surface reconstruction methods primarily rely on cost volumes that summarize multi-view features using simple statistics (e.g., mean and variance), which discard critical view-dependent geometric structure and often lead to over-smoothed reconstructions. We propose EpiS, a generalizable neural surface reconstruction framework that explicitly leverages epipolar geometry for sparse-view inputs. Instead of directly regressing geometry from cost-volume statistics, EpiS uses coarse cost-volume features to guide the aggregation of fine-grained epipolar features sampled along corresponding epipolar lines across source views. An epipolar transformer fuses multi-view information, followed by ray-wise aggregation to produce SDF-aware features for surface estimation. To further mitigate information loss under sparse views, we introduce a geometry regularization strategy that leverages a pretrained monocular depth model through scale-invariant global and local constraints. Extensive experiments on DTU and BlendedMVS demonstrate that EpiS significantly outperforms state-of-the-art generalizable surface reconstruction methods under sparse-view settings, while maintaining strong generalization without per-scene optimization.
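To make the point about cost-volume statistics concrete, the following is a minimal sketch in plain Python of how channel-wise mean and variance summarize per-view features at a single 3D sample. It is an illustration only, not the EpiS implementation: the function name `cost_volume_stats` and the toy feature values are assumptions introduced here. It shows that the summary is unchanged when the same feature vectors are assigned to different source views, so the information about which view observed what is lost.

```python
from statistics import fmean, pvariance


def cost_volume_stats(per_view_features):
    """Summarize one 3D sample's per-view features by channel-wise mean and variance.

    per_view_features: list of feature vectors, one per source view
    (hypothetical layout, chosen for illustration).
    """
    # Regroup the values so that each tuple holds one channel across all views.
    channels = list(zip(*per_view_features))
    mean = [fmean(c) for c in channels]
    var = [pvariance(c) for c in channels]
    return mean, var


# Toy features for one 3D sample seen from three source views (assumed values).
views_a = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
# The same feature vectors, but assigned to different source views.
views_b = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]

stats_a = cost_volume_stats(views_a)
stats_b = cost_volume_stats(views_b)

# Both configurations collapse to the same (mean, variance) summary,
# up to floating-point rounding.
print(stats_a)
print(stats_b)
```

Because the two view assignments produce the same summary, a network that sees only these statistics cannot tell them apart. This is the kind of view-dependent structure that sampling features along epipolar lines in each source view is meant to preserve.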