Integrating language models into robotic exploration frameworks improves performance in unmapped environments by providing the ability to reason over semantic groundings, contextual cues, and temporal states. The proposed method employs large language models (GPT-3.5 and Claude Haiku) to reason over these cues and express that reasoning in terms of natural language, which can be used to inform future states. We are motivated by the context of search-and-rescue applications where efficient exploration is critical. We find that by leveraging natural language, semantics, and tracking temporal states, the proposed method greatly reduces exploration path distance and further exposes the need for environment-dependent heuristics. Moreover, the method is highly robust to a variety of environments and noisy vision detections, as shown with a 100% success rate in a series of comprehensive experiments across three different environments conducted in a custom simulation pipeline operating in Unreal Engine.
翻译:将语言模型集成到机器人探索框架中,通过提供对语义基础、上下文线索和时间状态的推理能力,提升了在未知环境中的性能表现。所提出的方法采用大型语言模型(GPT-3.5与Claude Haiku)对这些线索进行推理,并以自然语言形式表达推理过程,该过程可用于预测未来状态。我们的研究动机源于搜索救援应用场景,其中高效探索至关重要。研究发现,通过利用自然语言、语义信息及时间状态追踪,所提出的方法显著缩短了探索路径距离,并进一步揭示了环境依赖启发式规则的必要性。此外,该方法对多样化环境和噪声视觉检测具有高度鲁棒性——通过在Unreal Engine中运行的自定义仿真流水线,在三种不同环境中进行的系列综合实验实现了100%的成功率。