Recent advances in robot control using large language models (LLMs) have demonstrated significant potential, primarily because LLMs can understand natural language commands and generate executable plans in various languages. However, in real-time, interactive applications involving mobile robots, particularly drones, the sequential token generation inherent to LLMs introduces substantial latency (i.e., response time) in control plan generation. In this paper, we present ChatFly, a system that tackles this problem by combining a novel programming language, MiniSpec, with its runtime to reduce both plan generation time and drone response time. That is, instead of asking an LLM to write a program (a robotic plan) in the popular but verbose Python, ChatFly asks it to write the program in MiniSpec, which is specially designed for token efficiency and stream interpretation. Using a set of challenging drone tasks, we show that the design choices made by ChatFly can reduce response time by up to 62% and provide a more consistent user experience, enabling responsive and intelligent LLM-based drone control with efficient completion.