Efficient aerial data collection is important in many remote sensing applications. In large-scale monitoring scenarios, deploying a team of unmanned aerial vehicles (UAVs) offers improved spatial coverage and robustness against individual failures. However, a key challenge is cooperative path planning, in which the UAVs must coordinate to efficiently achieve a joint mission goal. We propose a novel multi-agent informative path planning approach based on deep reinforcement learning for adaptive terrain monitoring with UAV teams. We introduce new network feature representations to effectively learn path planning in a 3D workspace. By leveraging a counterfactual baseline, our approach explicitly addresses credit assignment to learn cooperative behaviour. Our experimental evaluation shows improved planning performance over non-counterfactual variants, i.e. our approach maps regions of interest more quickly. Results on synthetic and real-world data show that our approach outperforms state-of-the-art non-learning-based methods and transfers to varying team sizes and communication constraints.
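To make the role of the counterfactual baseline concrete, the following is a minimal sketch of a counterfactual advantage for a single agent, in the style of counterfactual multi-agent policy gradients. It is an illustration under stated assumptions, not the formulation used in this work: the function name `counterfactual_advantage`, the discrete action set, and the toy critic values are all hypothetical. The idea is to compare the value of the action an agent actually took against the policy-weighted average value over its alternative actions, with the other agents' actions held fixed.

```python
import numpy as np


def counterfactual_advantage(q_values, policy, action):
    """Advantage of one agent's action relative to a counterfactual baseline.

    q_values[k]: assumed critic estimate of the joint return if this agent
                 takes action k while all other agents' actions stay fixed.
    policy[k]:   this agent's probability of choosing action k.
    action:      index of the action the agent actually took.
    """
    q_values = np.asarray(q_values, dtype=float)
    policy = np.asarray(policy, dtype=float)
    # Baseline: expected value when only this agent's action is marginalised out.
    baseline = float(np.dot(policy, q_values))
    return float(q_values[action] - baseline)


# Toy values (hypothetical): three candidate actions for one UAV.
q = [1.0, 3.0, 2.0]
pi = [0.5, 0.25, 0.25]
adv = counterfactual_advantage(q, pi, action=1)
print(adv)
```

Because the baseline depends only on the other agents' actions and not on the agent's own choice, each agent receives a learning signal that reflects its individual contribution to the team outcome, which is the credit assignment problem referred to above.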