Integrating Unmanned Aerial Vehicles (UAVs) with Unmanned Ground Vehicles (UGVs) provides an effective solution for persistent surveillance in disaster management. UAVs excel at covering large areas rapidly, but their range is limited by battery capacity. UGVs, though slower, can carry larger batteries for extended missions. By using UGVs as mobile recharging stations, UAVs can extend mission duration through periodic recharging, leveraging the complementary strengths of both systems. To address this energy-aware UAV-UGV cooperative routing problem, we propose a planning framework that determines optimal routes for the UAV and the UGV and the points at which they rendezvous for recharging. Our solution employs a deep reinforcement learning (DRL) framework built on an encoder-decoder transformer architecture with multi-head attention. This architecture enables the model to sequentially select actions for visiting mission points and for coordinating recharging rendezvous between the UAV and the UGV. The DRL model is trained to minimize the age periods of mission points (the time gaps between consecutive visits), ensuring effective surveillance. We evaluate the framework across various problem sizes and distributions, comparing its performance against heuristic methods and an existing learning-based model. Results show that our approach consistently outperforms these baselines in both solution quality and runtime. Additionally, we demonstrate the DRL policy's applicability in a real-world disaster scenario as a case study and explore its potential for online mission planning under dynamic changes. Adapting the DRL policy to priority-driven surveillance further highlights the model's generalizability for real-time disaster response.
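To make the age-period objective concrete, the minimal sketch below computes the time gaps between consecutive visits to each mission point from a visit log and aggregates them into a cost a planner might minimize. The function names, the log format, and the choice of aggregating by the maximum gap are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def age_periods(visit_log):
    """Age periods per mission point: gaps between consecutive visit times.

    visit_log: list of (time, point_id) tuples, sorted by time.
    Returns dict mapping point_id -> list of gaps between consecutive visits.
    """
    visits = defaultdict(list)
    for t, point in visit_log:
        visits[point].append(t)
    return {p: [t2 - t1 for t1, t2 in zip(ts, ts[1:])] for p, ts in visits.items()}

def surveillance_cost(visit_log):
    """Illustrative cost: the largest age period over all mission points.

    A DRL reward could be the negative of this quantity; the exact
    aggregation (max vs. sum vs. mean) is an assumption here.
    """
    gaps = [g for per_point in age_periods(visit_log).values() for g in per_point]
    return max(gaps) if gaps else 0.0

# Example: point "A" visited at t=0, 30, 90 -> age periods [30, 60].
log = [(0, "A"), (10, "B"), (30, "A"), (55, "B"), (90, "A")]
print(age_periods(log))        # {'A': [30, 60], 'B': [45]}
print(surveillance_cost(log))  # 60
```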