Enabling robots to autonomously discover high-level spatial concepts (e.g., rooms and walls) from primitive geometric observations (e.g., planar surfaces) within 3D Scene Graphs is essential for robust indoor navigation and mapping. These graphs provide a hierarchical metric-semantic representation in which such concepts are organized. To further enhance graph-SLAM performance, Factorized 3D Scene Graphs incorporate these concepts as optimization factors that constrain relative geometry and enforce global consistency. However, both stages of this process remain largely manual: concepts are typically derived using hand-crafted, concept-specific heuristics, and the factors and their covariances are likewise designed by hand. This reliance on manual specification limits generalization across diverse environments and scalability to new concept classes. This paper presents a novel learning-based method that infers spatial concepts online from observed vertical planes and introduces them as optimizable factors within a SLAM backend, eliminating the need to handcraft concept generation, factor design, and covariance specification. We evaluate our approach in simulated environments with complex layouts, where it improves room detection by 20.7% and trajectory estimation by 19.2%, and further validate it on real construction sites, where room detection improves by 5.3% and map matching accuracy by 3.8%. The results confirm that learned factors can improve on their handcrafted counterparts in SLAM systems and serve as a foundation for extending this approach to new spatial concepts.