Artificial intelligence (AI) can transform drone capabilities, especially when real-world information is integrated with drone sensing, command, and control, an area within the growing field of physical AI. Large language models (LLMs) trained at scale on general knowledge are advantageous here, particularly when the training data includes detailed geographic and topographic map information for the entire planet and when the model can access real-time situational data such as weather. However, interfacing drones with LLMs remains a challenge: each application requires a tedious, labor-intensive effort to connect the LLM's trained knowledge to drone command and control. Here, we solve that problem with an interface strategy that is both LLM-agnostic and drone-agnostic, providing the first universal, versatile, comprehensive, and easy-to-use drone control interface. We do this using the new Model Context Protocol (MCP), an open standard that provides a universal way for AI systems to access external data, tools, and services. We develop and deploy a cloud-based Linux machine hosting an MCP server that supports the MAVLink protocol, a ubiquitous drone control language used by millions of drones, including those running the ArduPilot and PX4 frameworks. We demonstrate flight control of a real unmanned aerial vehicle. In further testing, we demonstrate extensive flight planning and control capability in a simulated drone, integrated with a Google Maps MCP server for up-to-date, real-time navigation information. This demonstrates a universal approach to integrating LLMs with drone command and control, a paradigm that couples virtually all of the modern AI industry with drone technology through an easy-to-use interface that translates natural language into drone control.
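To make the MCP-to-MAVLink bridging idea concrete, the following is a minimal sketch, not the implementation described in this work. It assumes that an LLM client sends a JSON-RPC `tools/call` request, as MCP defines, and that the server maps the tool name onto the fields of a MAVLink `COMMAND_LONG` message. The tool names (`arm`, `takeoff`, `land`), the `altitude_m` argument, and the helper functions are hypothetical; only the `MAV_CMD` numeric identifiers come from the MAVLink common message set. Nothing is transmitted to a vehicle here.

```python
from dataclasses import dataclass, field
from typing import Any

# MAV_CMD identifiers from the MAVLink common message set.
MAV_CMD_NAV_LAND = 21
MAV_CMD_NAV_TAKEOFF = 22
MAV_CMD_COMPONENT_ARM_DISARM = 400


@dataclass
class CommandLong:
    """Fields of a MAVLink COMMAND_LONG message: a command id plus seven params."""
    command: int
    params: list[float] = field(default_factory=lambda: [0.0] * 7)


def build_command(tool: str, args: dict[str, Any]) -> CommandLong:
    """Translate a (hypothetical) tool name and arguments into COMMAND_LONG fields."""
    if tool == "arm":
        cmd = CommandLong(MAV_CMD_COMPONENT_ARM_DISARM)
        cmd.params[0] = 1.0  # param1: 1 = arm, 0 = disarm
        return cmd
    if tool == "takeoff":
        altitude = float(args["altitude_m"])  # hypothetical argument name
        if altitude <= 0:
            raise ValueError("altitude_m must be positive")
        cmd = CommandLong(MAV_CMD_NAV_TAKEOFF)
        cmd.params[6] = altitude  # param7: target altitude in metres
        return cmd
    if tool == "land":
        return CommandLong(MAV_CMD_NAV_LAND)
    raise ValueError(f"unknown tool: {tool}")


def handle_tools_call(request: dict[str, Any]) -> dict[str, Any]:
    """Answer an MCP 'tools/call' JSON-RPC request with a text result."""
    params = request["params"]
    try:
        cmd = build_command(params["name"], params.get("arguments", {}))
        text, is_error = f"queued MAV_CMD {cmd.command} params={cmd.params}", False
    except (KeyError, ValueError) as exc:
        # Report bad or missing arguments back to the LLM instead of crashing.
        text, is_error = f"rejected: {exc}", True
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": is_error},
    }
```

In a deployed server, the `CommandLong` fields would be handed to a MAVLink library for transmission to the autopilot; that step is omitted so the sketch stays self-contained. Returning validation failures as tool results with `isError` set lets the LLM see the problem and correct its own request.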