While initial approaches to Structure-from-Motion (SfM) revolved around both global and incremental methods, most recent applications rely on incremental systems to estimate camera poses because of their superior robustness. Though there has been tremendous progress in SfM "front-ends" powered by deep models learned from data, the state-of-the-art (incremental) SfM pipelines still rely on classical SIFT features, developed in 2004. In this work, we investigate whether leveraging recent developments in feature extraction and matching helps global SfM perform on par with the state-of-the-art incremental SfM approach (COLMAP). To do so, we design a modular SfM framework that allows us to easily combine developments in different stages of the SfM pipeline. Our experiments show that while developments in deep-learning-based two-view correspondence estimation do translate to improvements in point density for scenes reconstructed with global SfM, none of them outperforms SIFT when the results are compared with those of incremental SfM on a range of datasets. Our SfM system is designed from the ground up to leverage distributed computation, enabling us to parallelize work across multiple machines and scale to large scenes.