This paper presents the first end-to-end framework that combines guidance, navigation, and centralised task allocation for multiple UAVs performing autonomous search-and-rescue (SAR) in GNSS-denied indoor environments. A Twin Delayed Deep Deterministic Policy Gradient (TD3) controller is trained with an Artificial Potential Field (APF) reward that combines attractive and repulsive potentials; paired with continuous control, this reward accelerates convergence and yields smoother, safer trajectories than distance-only baselines. Collaborative mission assignment is solved by a deep Graph Attention Network (GAT) that, at each decision step, reasons over the drone-task graph to produce near-optimal allocations with negligible on-board compute. To arrest the well-known Z-drift of indoor LiDAR-SLAM, we fuse depth-camera altimetry with IMU vertical velocity in a lightweight complementary filter, giving centimetre-level altitude stability without external beacons. The resulting system was deployed on two 1 m-class quadrotors and flight-tested in a cluttered, multi-level disaster mock-up designed for the NATO-Sapience Autonomous Cooperative Drone Competition. Whereas prior deep reinforcement learning (DRL) guidance remains largely in simulation, our framework navigated this complex indoor environment in real flight and secured first place in the 2024 event. These results indicate that APF-shaped DRL and GAT-driven cooperation can translate to reliable real-world SAR operations.
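To make the idea of an APF-shaped reward concrete, the following is a minimal sketch and not the authors' reward function. It assumes the classic quadratic attractive potential and an inverse-distance repulsive potential, and rewards the agent by the decrease in total potential between consecutive steps. The names `apf_potential` and `apf_reward`, the gains `k_att` and `k_rep`, and the influence radius `d0` are all hypothetical.

```python
import math


def apf_potential(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.5):
    """Total potential at `pos` (assumed form, not taken from the paper).

    Attractive term: 0.5 * k_att * d_goal^2.
    Repulsive term:  0.5 * k_rep * (1/d - 1/d0)^2 for each obstacle closer than d0.
    """
    d_goal = math.dist(pos, goal)
    potential = 0.5 * k_att * d_goal ** 2
    for obs in obstacles:
        # Clamp the distance to avoid division by zero at contact.
        d = max(math.dist(pos, obs), 1e-3)
        if d < d0:
            potential += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return potential


def apf_reward(prev_pos, pos, goal, obstacles, **gains):
    """Reward is the drop in potential from the previous position to the current one."""
    before = apf_potential(prev_pos, goal, obstacles, **gains)
    after = apf_potential(pos, goal, obstacles, **gains)
    return before - after


# Hypothetical step: the UAV moves 1 m toward a goal 5 m away.
prev_pos, pos, goal = (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (5.0, 0.0, 1.0)
r_free = apf_reward(prev_pos, pos, goal, obstacles=[])
r_near_obstacle = apf_reward(prev_pos, pos, goal, obstacles=[(1.5, 0.0, 1.0)])
```

Under these assumptions, the same step toward the goal earns less reward when it brings the vehicle inside an obstacle's influence radius, which is the shaping effect a distance-only reward lacks.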
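The altitude fusion described above can be illustrated with a first-order complementary filter. This is a sketch under stated assumptions rather than the deployed implementation: the class name `AltitudeComplementaryFilter`, the blending weight `alpha`, and the update signature are hypothetical, and the abstract does not give the filter's actual structure or gains.

```python
class AltitudeComplementaryFilter:
    """First-order complementary filter for altitude (illustrative only).

    The IMU vertical velocity is integrated for short-term smoothness, and the
    depth-camera altitude corrects the long-term drift of that integration.
    """

    def __init__(self, alpha=0.98, z0=0.0):
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        self.alpha = alpha  # weight on the IMU-propagated estimate (assumed value)
        self.z = z0         # current altitude estimate in metres

    def update(self, z_depth, vz_imu, dt):
        """Fuse one depth-camera altitude and one IMU vertical velocity sample."""
        z_pred = self.z + vz_imu * dt  # propagate with the IMU velocity
        self.z = self.alpha * z_pred + (1.0 - self.alpha) * z_depth
        return self.z


# Single hypothetical update: 0.9 * (0 + 1.0 * 0.1) + 0.1 * 0.2
single = AltitudeComplementaryFilter(alpha=0.9, z0=0.0)
z_one_step = single.update(z_depth=0.2, vz_imu=1.0, dt=0.1)

# Hovering case: zero vertical velocity, depth camera reads a constant 1.0 m.
hover = AltitudeComplementaryFilter(alpha=0.9, z0=0.0)
for _ in range(200):
    z_hover = hover.update(z_depth=1.0, vz_imu=0.0, dt=0.02)
```

A larger `alpha` trusts the IMU propagation more and reacts more slowly to the depth camera; the right trade-off depends on sensor noise that the abstract does not report.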