This paper studies two-view 3D reconstruction in a strict sparse-view configuration, where the input image pairs offer too few correspondences for reliable camera pose estimation. We present a novel Neural One-PlanE RANSAC framework (NOPE-SAC for short) that learns one-plane pose hypotheses from 3D plane correspondences. Built on top of a Siamese plane detection network, NOPE-SAC first generates putative plane correspondences with a coarse initial pose. It then feeds the learned 3D plane parameters of these correspondences into shared MLPs to estimate one-plane camera pose hypotheses, which are subsequently reweighted in a RANSAC manner to obtain the final camera pose. Because each neural one-plane pose hypothesis is generated from a single plane correspondence, the minimum possible, the framework supports stable pose voting and reliable pose refinement from only a few plane correspondences in sparse-view inputs. In the experiments, we demonstrate that NOPE-SAC significantly improves camera pose estimation for two-view inputs with severe viewpoint changes, setting new state-of-the-art results on two challenging benchmarks for sparse-view 3D reconstruction, MatterPort3D and ScanNet. The source code is released at https://github.com/IceTTTb/NopeSAC for reproducible research.
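To make the hypothesize-and-vote idea concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes planes are written as (n, d) with n·X = d and that a pose (R, t) maps points as X2 = R X1 + t, so a plane moves as n2 = R n1, d2 = d1 + n2·t. The function names, the hand-set thresholds, and the hard inlier count are illustrative assumptions; NOPE-SAC itself predicts hypotheses with shared MLPs and reweights them with learned scores.

```python
import numpy as np

def transform_plane(n, d, R, t):
    """Move a plane n.X = d from view 1 to view 2 under X2 = R @ X1 + t."""
    n2 = R @ n
    return n2, d + n2 @ t

def score_hypotheses(planes1, planes2, hypotheses,
                     ang_thresh=np.deg2rad(10.0), off_thresh=0.2):
    """Count, for each pose hypothesis, the plane correspondences it explains.

    The thresholds are hypothetical hand-picked values, not taken from the paper.
    """
    scores = []
    for R, t in hypotheses:
        inliers = 0
        for (n1, d1), (n2, d2) in zip(planes1, planes2):
            n_pred, d_pred = transform_plane(n1, d1, R, t)
            # Angle between predicted and observed normals, and offset gap.
            angle = np.arccos(np.clip(n_pred @ n2, -1.0, 1.0))
            if angle < ang_thresh and abs(d_pred - d2) < off_thresh:
                inliers += 1
        scores.append(inliers)
    return np.array(scores)

# Toy data: three orthogonal planes seen under a 30-degree rotation about z.
c, s = np.cos(np.deg2rad(30.0)), np.sin(np.deg2rad(30.0))
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, 0.0, 0.1])

planes1 = [(np.array([1.0, 0.0, 0.0]), 1.0),
           (np.array([0.0, 1.0, 0.0]), 2.0),
           (np.array([0.0, 0.0, 1.0]), 3.0)]
planes2 = [transform_plane(n, d, R_true, t_true) for n, d in planes1]

# Two candidate poses: the identity (wrong) and the true pose.
hypotheses = [(np.eye(3), np.zeros(3)), (R_true, t_true)]
scores = score_hypotheses(planes1, planes2, hypotheses)
best = int(np.argmax(scores))
print(scores, best)
```

In this toy setup the true pose explains every correspondence, while the identity pose explains only the plane whose normal lies along the rotation axis. A soft weighting of hypotheses, rather than the hard argmax used here, would be closer in spirit to the reweighting described in the abstract.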