Unmanned Aerial Vehicle (UAV) swarms have applications in logistics, agriculture, and surveillance, yet controlling them safely and feasibly requires expert knowledge. Traditional static methods limit adaptability, while Large Language Models (LLMs) enable natural-language control but can generate unsafe trajectories because they lack physical grounding. This paper introduces SkySim, a ROS2-based simulation framework built on Gazebo that decouples high-level LLM planning from low-level safety enforcement. Using Gemini 3.5 Pro, SkySim translates user commands (e.g., "Form a circle") into spatial waypoints, informed by real-time drone states. An Artificial Potential Field (APF) safety filter then applies minimal adjustments to enforce collision avoidance, kinematic limits, and geo-fencing, running at 20 Hz so that execution remains feasible. Experiments with swarms of 3, 10, and 30 Crazyflie drones validate spatial reasoning accuracy (100% across the tested geometric primitives), real-time collision prevention, and scalability. SkySim lets non-experts iteratively refine swarm behaviors, bridging AI cognition and robotic safety in dynamic environments. Future work targets hardware integration.
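To make the role of the safety filter concrete, the following is a minimal sketch of one APF update step, not SkySim's actual implementation. It assumes a classic attractive/repulsive potential field, a first-order position integrator, and a box-shaped geo-fence. The function name `apf_step` and every gain, radius, speed limit, and fence bound are hypothetical values chosen for illustration; only the 20 Hz rate (`dt = 0.05` s) comes from the abstract.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def _scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def _norm(a: Vec3) -> float:
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def apf_step(
    positions: Sequence[Vec3],
    waypoints: Sequence[Vec3],
    dt: float = 0.05,            # 20 Hz update period, as stated in the abstract
    k_att: float = 1.0,          # assumed attractive gain
    k_rep: float = 0.1,          # assumed repulsive gain
    d_influence: float = 0.5,    # assumed repulsion radius in metres
    v_max: float = 1.0,          # assumed speed limit in m/s
    fence_min: Vec3 = (-5.0, -5.0, 0.0),  # assumed geo-fence corners
    fence_max: Vec3 = (5.0, 5.0, 3.0),
) -> List[Vec3]:
    """Return the next commanded position for every drone."""
    next_positions: List[Vec3] = []
    for i, (p, w) in enumerate(zip(positions, waypoints)):
        # Attractive term: pull the drone toward its LLM-provided waypoint.
        v = _scale(_sub(w, p), k_att)
        # Repulsive term: push away from every neighbour inside d_influence.
        for j, q in enumerate(positions):
            if j == i:
                continue
            offset = _sub(p, q)
            d = _norm(offset)
            if 1e-9 < d < d_influence:
                gain = k_rep * (1.0 / d - 1.0 / d_influence) / (d * d)
                v = _add(v, _scale(offset, gain / d))
        # Kinematic limit: cap the commanded speed at v_max.
        speed = _norm(v)
        if speed > v_max:
            v = _scale(v, v_max / speed)
        # Integrate one time step, then clamp the result into the geo-fence box.
        step = _add(p, _scale(v, dt))
        fenced = (
            min(max(step[0], fence_min[0]), fence_max[0]),
            min(max(step[1], fence_min[1]), fence_max[1]),
            min(max(step[2], fence_min[2]), fence_max[2]),
        )
        next_positions.append(fenced)
    return next_positions
```

Calling `apf_step` once per control cycle with the current drone states and the planner's waypoints yields the next set of position commands. When no neighbour lies within `d_influence`, the speed limit is not exceeded, and the drone stays inside the fence, the output reduces to plain waypoint tracking, which is one way to read "minimal adjustments" in this simplified setting.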