We present a framework that lets non-experts program robots intuitively using natural language prompts and contextual information from the Robot Operating System (ROS). The system integrates large language models (LLMs) so that non-experts can state task requirements through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a wide range of open-source and commercial LLMs; automatic extraction of a behavior from the LLM output and execution of ROS actions/services; support for three behavior modes (sequence, behavior tree, state machine); imitation learning for adding new robot actions to the action library; and LLM reflection via human and environment feedback. Extensive experiments validate the framework, demonstrating robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate adoption of the framework and support reproduction of our results, we have released our code as open source at https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.
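To make the "extract a behavior from the LLM output, then execute it" step concrete, the following is a minimal sketch of the simplest of the three modes (a sequence). It is not the framework's actual implementation: the JSON step format, the function names `extract_sequence` and `run_sequence`, and the `pick`/`place` actions are all hypothetical, and plain Python callables stand in for ROS actions/services so that the example runs without a ROS installation.

```python
import json
import re
from typing import Callable, Dict, List


def make_library(log: List[str]) -> Dict[str, Callable[..., bool]]:
    """Hypothetical action library; each callable stands in for a ROS action/service."""

    def pick(obj: str) -> bool:
        log.append(f"pick {obj}")
        return True

    def place(obj: str, target: str) -> bool:
        log.append(f"place {obj} on {target}")
        return True

    return {"pick": pick, "place": place}


def extract_sequence(llm_reply: str) -> List[dict]:
    """Pull a JSON array of steps out of free-form LLM text (assumed format)."""
    # Greedy match from the first '[' to the last ']' in the reply.
    match = re.search(r"\[.*\]", llm_reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no behavior found in LLM reply")
    return json.loads(match.group(0))


def run_sequence(steps: List[dict], library: Dict[str, Callable[..., bool]]) -> bool:
    """Run the steps in order; stop and report failure at the first failed step."""
    for step in steps:
        name = step["action"]
        if name not in library:
            raise KeyError(f"unknown action: {name}")
        if not library[name](**step.get("args", {})):
            return False
    return True


# Example reply, as a chat-style LLM might phrase it (invented for illustration).
reply = (
    "Sure, here is the plan:\n"
    '[{"action": "pick", "args": {"obj": "cup"}},\n'
    ' {"action": "place", "args": {"obj": "cup", "target": "tray"}}]\n'
    "Let me know if you want changes."
)

log: List[str] = []
succeeded = run_sequence(extract_sequence(reply), make_library(log))
```

In this sketch, a failed step or an unknown action name is the natural place to feed an error message back to the LLM, which is one way the human and environment feedback mentioned above could enter the loop.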