Swarm-GPT: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design

This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning, offering an automated and novel approach to deployable drone swarm choreography. Swarm-GPT enables users to automatically generate synchronized drone performances through natural language instructions. With an emphasis on safety and creativity, Swarm-GPT addresses a critical gap in the field of drone choreography by integrating the creative power of generative models with the effectiveness and safety of model-based planning algorithms. This goal is achieved by prompting the LLM to generate a unique set of waypoints based on extracted audio data. A trajectory planner processes these waypoints to guarantee collision-free and feasible motion. Results can be viewed in simulation prior to execution and modified through dynamic re-prompting. Sim-to-real transfer experiments demonstrate Swarm-GPT's ability to accurately replicate simulated drone trajectories, with a mean sim-to-real root mean square error (RMSE) of 28.7 mm. To date, Swarm-GPT has been successfully showcased at three live events, exemplifying safe real-world deployment of pre-trained models.
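
The abstract reports a mean sim-to-real RMSE but does not spell out how it is computed. The sketch below shows one plausible reading, assuming each drone's simulated and measured positions are sampled at the same time steps and the per-drone RMSEs are averaged across the swarm. The function names `trajectory_rmse` and `mean_swarm_rmse` and the data layout are hypothetical and are not taken from the paper.

```python
import math


def trajectory_rmse(sim, real):
    """RMSE of the Euclidean position error between two time-aligned trajectories.

    Assumption: `sim` and `real` are equal-length sequences of (x, y, z)
    positions sampled at the same time steps, in the same units.
    """
    if len(sim) != len(real) or len(sim) == 0:
        raise ValueError("trajectories must be non-empty and the same length")
    # Squared Euclidean distance between corresponding samples.
    squared_errors = [
        sum((s - r) ** 2 for s, r in zip(p_sim, p_real))
        for p_sim, p_real in zip(sim, real)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_swarm_rmse(sim_by_drone, real_by_drone):
    """Average the per-drone RMSE over every drone in the swarm.

    Assumption: both arguments are dicts keyed by a drone identifier.
    """
    per_drone = [
        trajectory_rmse(sim_by_drone[drone], real_by_drone[drone])
        for drone in sim_by_drone
    ]
    return sum(per_drone) / len(per_drone)
```

With positions expressed in metres, multiplying the result by 1000 gives a value in millimetres, the unit used for the figure quoted above.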
