We leverage 3D Gaussian Splatting (3DGS) as a scene representation and propose a novel test-time camera pose refinement framework, GSLoc. This framework enhances the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression methods. The 3DGS model renders high-quality synthetic images and depth maps to facilitate the establishment of 2D-3D correspondences. GSLoc obviates the need for training feature extractors or descriptors by operating directly on RGB images, utilizing the 3D foundation model MASt3R for precise 2D matching. To improve robustness in challenging outdoor environments, we incorporate an exposure-adaptive module within the 3DGS framework. Consequently, GSLoc enables efficient one-shot pose refinement given a single RGB query and a coarse initial pose estimate. Our approach surpasses leading NeRF-based optimization methods in both accuracy and runtime across indoor and outdoor visual localization benchmarks, achieving new state-of-the-art accuracy on two indoor datasets.
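The 2D-3D correspondence step described above can be illustrated with a minimal sketch: pixels in the rendered image are lifted to 3D using the rendered depth map and camera intrinsics, then mapped into the world frame with the coarse initial pose. The function names and the pinhole-camera setup here are our own illustrative assumptions, not the paper's implementation; the actual pipeline uses MASt3R matches and a 3DGS renderer.

```python
import numpy as np

def backproject(pixels, depth, K):
    """Lift 2D pixels to 3D camera-frame points via X = d * K^{-1} [u, v, 1]^T.

    pixels: (N, 2) array of (u, v) coordinates; depth: (H, W) rendered depth map;
    K: (3, 3) pinhole intrinsics. (Illustrative sketch, not the paper's code.)
    """
    uv1 = np.hstack([pixels, np.ones((len(pixels), 1))])          # homogeneous pixels (N, 3)
    d = depth[pixels[:, 1].astype(int), pixels[:, 0].astype(int)]  # sample depth at (v, u)
    return (np.linalg.inv(K) @ uv1.T).T * d[:, None]               # (N, 3) camera-frame points

def cam_to_world(X_cam, R, t):
    """Map camera-frame points to the world frame given a camera-to-world pose (R, t)."""
    return X_cam @ R.T + t

# Toy example: flat depth at 2 m, principal point at (32, 24).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0,   0.0,  1.0]])
depth = np.full((48, 64), 2.0)
pix = np.array([[32.0, 24.0], [42.0, 24.0]])
X_world = cam_to_world(backproject(pix, depth, K), np.eye(3), np.zeros(3))
```

With world-frame 3D points paired to matched query pixels, the refined pose follows from a standard PnP solve (e.g., RANSAC-based), which is the usual way such 2D-3D correspondences are consumed.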