Navigating to destinations from spoken human instructions is essential for autonomous mobile robots operating in the real world. Although a robot can take different paths to the same goal, the shortest path is not always the best one. A desirable approach accommodates waypoint specifications flexibly and plans a better alternative path, even if it involves a detour. Robots also require real-time inference capabilities. Spatial representations span semantic, topological, and metric levels, each capturing a different aspect of the environment. This study aims to realize a hierarchical spatial representation based on a topometric semantic map, together with path planning from speech instructions that include waypoints. We propose SpCoTMHP, a hierarchical path-planning method that uses multimodal spatial concepts and incorporates place connectivity. The approach provides a novel integrated probabilistic generative model and fast approximate inference, with interaction among the levels of the hierarchy. A formulation based on control as probabilistic inference gives the proposed path planning a theoretical foundation. In navigation experiments using speech instructions with a waypoint, the method improved path-planning performance, raising WN-SPL by 0.589, and reduced computation time by 7.14 s compared with conventional methods. Hierarchical spatial representations offer a form that humans and robots can both understand, enabling language-based navigation tasks.