Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images. Despite their advantages in inference speed and simplicity, these methods still fall short of the accuracy achieved by geometry-based techniques. To address this issue, we propose a new model called the Neural Feature Synthesizer (NeFeS). Our approach encodes 3D geometric features during training and renders dense novel view features at test time to refine estimated camera poses from arbitrary APR methods. Unlike previous APR works that require additional unlabeled training data, our method leverages implicit geometric constraints during test time using a robust feature field. To enhance the robustness of our NeFeS network, we introduce a feature fusion module and a progressive training strategy. Our proposed method improves the state-of-the-art single-image APR accuracy by as much as 54.9% on indoor and outdoor benchmark datasets without additional time-consuming unlabeled data training.
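To make the test-time refinement idea concrete, the following is a minimal toy sketch, not the NeFeS implementation. It assumes that refinement compares features rendered at the current pose estimate against features observed in the query image and updates the pose to reduce the difference. Everything in it is a hypothetical stand-in: `feature_field` is an analytic function replacing the learned feature field, the pose is reduced to a 2D translation, and the squared-error loss, finite-difference gradient, and step size are illustrative choices rather than the paper's.

```python
import numpy as np

def feature_field(points):
    """Hypothetical analytic stand-in for a learned feature field.

    Maps (N, 2) world points to (N, 3) feature vectors.
    """
    x, y = points[:, 0], points[:, 1]
    return np.stack([np.sin(x), np.cos(y), np.sin(0.5 * (x + y))], axis=1)

def render_features(translation, grid):
    """'Render' dense features for a camera whose pose is a 2D translation (toy assumption)."""
    return feature_field(grid + translation)

def feature_loss(translation, target, grid):
    """Mean squared difference between rendered and observed features."""
    diff = render_features(translation, grid) - target
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def refine_pose(initial, target, grid, lr=0.3, steps=200, eps=1e-5):
    """Gradient descent on the feature loss, with central finite-difference gradients."""
    pose = np.asarray(initial, dtype=float).copy()
    for _ in range(steps):
        grad = np.zeros_like(pose)
        for i in range(pose.size):
            step = np.zeros_like(pose)
            step[i] = eps
            grad[i] = (feature_loss(pose + step, target, grid)
                       - feature_loss(pose - step, target, grid)) / (2 * eps)
        pose -= lr * grad
    return pose

# Dense 5x5 grid of sample points in the camera frame.
axis = np.linspace(-1.0, 1.0, 5)
grid = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)

true_pose = np.array([0.3, -0.2])
target = render_features(true_pose, grid)  # stands in for features extracted from the query image
coarse_pose = np.array([0.6, 0.1])         # stands in for an APR prediction
refined_pose = refine_pose(coarse_pose, target, grid)

print("error before:", np.linalg.norm(coarse_pose - true_pose))
print("error after: ", np.linalg.norm(refined_pose - true_pose))
```

In this toy setting the refinement needs no additional training data: the only inputs at test time are the coarse pose, the observed features, and the fixed feature field. A real system would use a full 6-DoF pose, a learned renderer, and automatic differentiation in place of finite differences.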