We propose a new structure-from-motion (SfM) framework that recovers accurate camera poses and point clouds from unordered images. Traditional SfM systems typically rely on detecting repeatable keypoints across multiple views as their first step. This is difficult in texture-poor scenes, and poor keypoint detection can break the whole SfM system. Our detector-free framework builds on the recent success of detector-free matchers to avoid committing to keypoints early, while resolving the multi-view inconsistency issue of detector-free matchers. Specifically, the framework first reconstructs a coarse SfM model from quantized detector-free matches. It then refines the model with a novel iterative pipeline that alternates between an attention-based multi-view matching module, which refines feature tracks, and a geometry refinement module, which improves reconstruction accuracy. Experiments demonstrate that the proposed framework outperforms existing detector-based SfM systems on common benchmark datasets. We also collect a texture-poor SfM dataset to demonstrate that the framework can reconstruct texture-poor scenes. Based on this framework, we took $\textit{first place}$ in the Image Matching Challenge 2023.
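To make the multi-view inconsistency issue concrete, the following is a minimal sketch of one plausible way to quantize two-view matches and merge them into multi-view tracks. It is not the authors' implementation: the abstract does not say how quantization is done, so the grid-snapping rule, the cell size of 8 pixels, the union-find merging, and the names `quantize` and `build_tracks` are all assumptions made for illustration.

```python
from collections import defaultdict


def quantize(pt, cell=8):
    """Snap a sub-pixel (x, y) location to the centre of its grid cell.

    The grid size `cell` is an illustrative assumption, not a value
    taken from the abstract.
    """
    x, y = pt
    return (int(x // cell) * cell + cell // 2,
            int(y // cell) * cell + cell // 2)


def build_tracks(pairwise_matches, cell=8):
    """Merge quantized two-view matches into multi-view tracks.

    pairwise_matches maps (img_a, img_b) to a list of
    ((xa, ya), (xb, yb)) sub-pixel correspondences.
    Returns a list of tracks; each track is a sorted list of
    (image_id, (qx, qy)) nodes.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for (img_a, img_b), matches in pairwise_matches.items():
        for pa, pb in matches:
            # After quantization, nearby sub-pixel locations in the same
            # image collapse to one node, so matches from different image
            # pairs can be chained into a single track.
            union((img_a, quantize(pa, cell)), (img_b, quantize(pb, cell)))

    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)
    return [sorted(g) for g in groups.values() if len(g) >= 2]


if __name__ == "__main__":
    matches = {
        (0, 1): [((10.3, 20.7), (50.2, 60.9))],
        (0, 2): [((12.9, 22.1), (90.0, 33.5))],
    }
    print(build_tracks(matches))
```

In this toy input, the two matches land on slightly different sub-pixel locations in image 0, so without quantization they would form two separate two-view tracks. Snapping both to the same grid cell yields one three-view track, at the cost of localization accuracy, which is the kind of error a later refinement stage would need to recover. The sketch does not handle tracks that end up with two nodes in the same image.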