This work advances autonomous robot exploration by integrating agent-level semantic reasoning with fast local control. We introduce FARE, a hierarchical autonomous exploration framework that integrates a large language model (LLM) for global reasoning with a reinforcement learning (RL) policy for local decision making. FARE follows a fast-slow thinking paradigm. The slow-thinking LLM module interprets a concise textual description of the unknown environment and synthesizes an agent-level exploration strategy, which is then grounded into a sequence of global waypoints through a topological graph. To further improve reasoning efficiency, this module employs a modularity-based pruning mechanism that reduces redundant graph structures. The fast-thinking RL module executes exploration by reacting to local observations while being guided by the LLM-generated global waypoints. The RL policy is additionally shaped by a reward term that encourages adherence to the global waypoints, enabling coherent and robust closed-loop behavior. This architecture decouples semantic reasoning from geometric decision making, allowing each module to operate at its appropriate temporal and spatial scale. In challenging simulated environments, our results show that FARE achieves substantial improvements in exploration efficiency over state-of-the-art baselines. We further deploy FARE on hardware and validate it in a complex, large-scale $200\,\mathrm{m}\times130\,\mathrm{m}$ building environment.
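The modularity-based pruning mentioned above can be illustrated with a minimal sketch. The paper does not specify the algorithm, so the following is an assumption: communities of the topological graph are detected by greedy modularity maximization (via `networkx`), and each community is collapsed into a single representative node so the LLM reasons over a much smaller graph. The function name `prune_topological_graph` and the choice of representative are hypothetical.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def prune_topological_graph(G):
    """Collapse each modularity community into one representative node,
    keeping only inter-community connectivity (illustrative sketch)."""
    communities = greedy_modularity_communities(G)
    rep = {}  # map each original node to its community representative
    H = nx.Graph()
    for comm in communities:
        r = min(comm)  # deterministic representative (hypothetical choice)
        for n in comm:
            rep[n] = r
        H.add_node(r)
    # Retain only edges that cross community boundaries.
    for u, v in G.edges():
        if rep[u] != rep[v]:
            H.add_edge(rep[u], rep[v])
    return H

# Example: two dense clusters (triangles) joined by one bridge edge
# collapse to a two-node graph with a single edge.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
H = prune_topological_graph(G)
```

In an exploration setting, the representative nodes would correspond to the candidate global waypoints passed to the LLM, while the pruned intra-community detail remains available to the local RL policy.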
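The waypoint-adherence reward term can likewise be sketched. The paper does not give its exact form; a common potential-based shaping choice, assumed here, rewards the decrease in distance to the current LLM-generated waypoint. The coefficient `alpha` and the function name are hypothetical.

```python
import math

def waypoint_adherence_reward(pos, waypoint, prev_dist, alpha=0.5):
    """Shaping term rewarding progress toward the active global waypoint.

    pos, waypoint : 2D positions (tuples of floats)
    prev_dist     : distance to the waypoint at the previous timestep
    alpha         : shaping coefficient (an assumption, not from the paper)

    Returns (reward, current_distance); the reward is positive when the
    agent moves closer to the waypoint and negative when it retreats.
    """
    dist = math.dist(pos, waypoint)
    return alpha * (prev_dist - dist), dist

# Moving from distance 3.0 to distance 2.0 yields a positive reward of 0.5.
r, d = waypoint_adherence_reward((1.0, 0.0), (3.0, 0.0), prev_dist=3.0)
```

This term would be added to the RL policy's task reward (e.g., explored-area gain), biasing local decisions toward the global plan without overriding reactive obstacle avoidance.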