This paper presents an end-to-end agentic exploration system, driven by a large language model (LLM), for an indoor shopping task, evaluated in both Gazebo simulation and a corresponding real-world corridor layout. The robot incrementally builds a lightweight semantic map by detecting signboards at junctions and storing direction-to-POI (point of interest) relations together with estimated junction poses, while AprilTags provide repeatable anchors for approach and alignment. Given a natural-language shopping request, the LLM produces a constrained discrete action at each junction (a direction and whether to enter a store), and a ROS finite-state main controller executes the decision by gating modular motion primitives, including local-costmap-based obstacle avoidance, AprilTag approaching, store entry, and grasping. Qualitative results show that the integrated stack can execute the task end to end, from user instruction through multi-store navigation to object retrieval, while remaining modular and debuggable through its text-based map and logged decision history.
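As an illustration of the pipeline summarized above, the following minimal sketch shows one way the text-based semantic map, the constrained junction action, and the controller gating could fit together. It is not the implementation used in this work: the JSON reply schema (`direction`, `enter_store`), the direction vocabulary, the class and function names, and the primitive labels are all assumptions introduced for the example, and the ROS, perception, and LLM calls are omitted.

```python
import json
from dataclasses import dataclass, field

# Hypothetical action vocabulary; the abstract only says the action is discrete.
DIRECTIONS = ("left", "right", "forward", "back")


@dataclass
class Junction:
    name: str
    pose: tuple  # estimated junction pose as (x, y, yaw)
    signs: dict = field(default_factory=dict)  # direction -> list of POI names


@dataclass
class SemanticMap:
    junctions: dict = field(default_factory=dict)

    def add_sign(self, name, pose, direction, poi):
        """Record that a signboard at junction `name` points `direction` to `poi`."""
        junction = self.junctions.setdefault(name, Junction(name, pose))
        junction.signs.setdefault(direction, []).append(poi)

    def to_text(self):
        """Serialize the map as plain text, suitable for an LLM prompt or a log."""
        lines = []
        for junction in self.junctions.values():
            x, y, yaw = junction.pose
            lines.append(f"{junction.name} at ({x:.1f}, {y:.1f}, {yaw:.2f})")
            for direction, pois in junction.signs.items():
                lines.append(f"  {direction}: {', '.join(pois)}")
        return "\n".join(lines)


def parse_action(reply):
    """Validate an LLM reply against the assumed schema; return None if invalid."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    direction = data.get("direction")
    enter = data.get("enter_store")
    if direction not in DIRECTIONS or not isinstance(enter, bool):
        return None
    return direction, enter


def next_primitive(action):
    """Map a validated action to the (hypothetical) primitive the controller enables."""
    if action is None:
        return "REQUERY_LLM"  # reject malformed output instead of moving
    direction, enter = action
    return "ENTER_STORE" if enter else f"NAVIGATE_{direction.upper()}"
```

Validating the reply before any primitive is enabled keeps malformed LLM output from reaching the motion layer, and the plain-text map can be logged alongside each decision for later inspection.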