Neural surface reconstruction is sensitive to camera pose noise, even when state-of-the-art pose estimators such as COLMAP or ARKit are used. More importantly, existing Pose-NeRF joint optimisation methods struggle to improve pose accuracy in challenging real-world scenarios. To overcome these challenges, we introduce the pose residual field (\textbf{PoRF}), a novel implicit representation that uses an MLP to regress pose updates. This is more robust than conventional pose parameter optimisation because parameter sharing leverages global information over the entire sequence. Furthermore, we propose an epipolar geometry loss that strengthens supervision by leveraging the correspondences exported from COLMAP results, without extra computational overhead. Our method yields promising results on two datasets. On the DTU dataset, we reduce the rotation error of COLMAP poses by 78\%, which lowers the reconstruction Chamfer distance from 3.48 mm to 0.85 mm. On the MobileBrick dataset, which contains casually captured unbounded 360-degree videos, our method refines ARKit poses and improves the reconstruction F1 score from 69.18 to 75.67, outperforming the result obtained with the dataset-provided ground-truth poses (75.14). These results demonstrate the efficacy of our approach in refining camera poses and improving the accuracy of neural surface reconstruction in real-world scenarios.
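The two ingredients named in the abstract, a shared MLP that regresses pose updates and an epipolar loss on exported correspondences, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the class `PoseResidualMLP`, its inputs (normalised frame index plus the initial axis-angle rotation and translation), the zero-initialised output layer, the scale `alpha`, the world-to-camera pose convention, and the use of the Sampson distance as the epipolar error. The sketch shows only the forward computation, not the training loop.

```python
import numpy as np


def skew(v):
    """Cross-product matrix [v]_x of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def rodrigues(w):
    """Axis-angle vector (3,) to rotation matrix (3, 3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + skew(w)  # first-order approximation near zero
    k = skew(w / theta)
    return np.eye(3) + np.sin(theta) * k + (1.0 - np.cos(theta)) * (k @ k)


class PoseResidualMLP:
    """Hypothetical shared MLP: (frame index, initial pose) -> 6-DoF residual."""

    def __init__(self, n_frames, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.n_frames = n_frames
        self.w1 = rng.normal(0.0, 0.1, size=(7, hidden))
        self.b1 = np.zeros(hidden)
        # Assumption: zero-initialised output layer, so refinement starts
        # exactly at the initial poses.
        self.w2 = np.zeros((hidden, 6))
        self.b2 = np.zeros(6)

    def __call__(self, idx, r0, t0):
        x = np.concatenate([[idx / self.n_frames], r0, t0])
        h = np.maximum(x @ self.w1 + self.b1, 0.0)  # ReLU
        return h @ self.w2 + self.b2


def refine_pose(mlp, idx, r0, t0, alpha=1e-2):
    """Add the scaled residual to the initial axis-angle rotation and translation."""
    d = mlp(idx, r0, t0)
    return r0 + alpha * d[:3], t0 + alpha * d[3:]


def sampson_error(r_i, t_i, r_j, t_j, k_mat, x_i, x_j):
    """Mean Sampson distance for pixel matches x_i <-> x_j, each of shape (N, 2).

    Poses are world-to-camera: X_cam = R @ X_world + t.
    """
    rot_i, rot_j = rodrigues(r_i), rodrigues(r_j)
    rot = rot_j @ rot_i.T            # relative rotation, camera i -> camera j
    trans = t_j - rot @ t_i          # relative translation
    k_inv = np.linalg.inv(k_mat)
    fund = k_inv.T @ skew(trans) @ rot @ k_inv  # fundamental matrix
    ones = np.ones((len(x_i), 1))
    p_i = np.hstack([x_i, ones])     # homogeneous pixels in view i
    p_j = np.hstack([x_j, ones])     # homogeneous pixels in view j
    f_pi = p_i @ fund.T              # rows are F @ p_i
    ft_pj = p_j @ fund               # rows are F^T @ p_j
    num = np.sum(p_j * f_pi, axis=1) ** 2
    den = f_pi[:, 0] ** 2 + f_pi[:, 1] ** 2 + ft_pj[:, 0] ** 2 + ft_pj[:, 1] ** 2
    return float(np.mean(num / den))
```

In a training setting, the refined poses from `refine_pose` would be passed to `sampson_error` for matched image pairs, and the MLP weights would be updated by gradient descent in an autodiff framework. Because the same weights produce the residual for every frame, each gradient step is informed by the whole sequence, which is the parameter-sharing argument made in the abstract.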