Efficiently obtaining up-to-date information from a disaster-stricken area is key to a successful disaster response. Unmanned aerial vehicles (UAVs), workers, and cars can collaborate to accomplish sensing tasks, such as data collection, in such areas. In this paper, we address route planning for a group of heterogeneous agents, comprising UAVs, workers, and cars, with the goal of maximizing the task completion rate. We propose MANF-RL-RP, a heterogeneous multi-agent route planning algorithm that incorporates several designs, including global-local dual information processing and a model structure tailored to heterogeneous multi-agent systems. Global-local dual information processing comprises the extraction and dissemination of spatial features from global information, together with the partitioning and filtering of each agent's local information. To construct the model structure, we design a common data structure to represent the states of the different agent types, prove the Markov property of the agents' decision-making process in order to simplify the model structure, and design a reward function to train the model. Finally, we conduct detailed experiments on extensive simulation data. Compared with the baseline algorithms Greedy-SC-RP and MANF-DNN-RP, MANF-RL-RP achieves a significant improvement in task completion rate.
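As a rough illustration of the idea of representing the states of different agent types with one common data structure, the following minimal Python sketch encodes a UAV, a worker, and a car as vectors of the same layout. The abstract does not specify the actual state fields, so every name here (`AgentKind`, `AgentState`, `remaining_budget`, `to_vector`) is hypothetical. The helper `task_completion_rate` likewise assumes the metric is simply completed tasks divided by total tasks, which the abstract does not define.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AgentKind(Enum):
    """Agent types named in the abstract."""
    UAV = 0
    WORKER = 1
    CAR = 2


@dataclass(frozen=True)
class AgentState:
    """Hypothetical shared state layout for all agent types (an assumption)."""
    kind: AgentKind
    x: float                 # assumed planar position
    y: float
    remaining_budget: float  # assumed resource left, e.g. energy or time

    def to_vector(self) -> List[float]:
        # One-hot agent type followed by the shared numeric fields, so every
        # agent yields a vector of the same length and layout.
        one_hot = [1.0 if self.kind is k else 0.0 for k in AgentKind]
        return one_hot + [self.x, self.y, self.remaining_budget]


def task_completion_rate(completed: int, total: int) -> float:
    """Assumed metric: fraction of sensing tasks that were completed."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= completed <= total:
        raise ValueError("completed must lie between 0 and total")
    return completed / total


if __name__ == "__main__":
    agents = [
        AgentState(AgentKind.UAV, 0.0, 0.0, 1.0),
        AgentState(AgentKind.WORKER, 3.0, 4.0, 0.8),
        AgentState(AgentKind.CAR, 1.0, 2.0, 0.5),
    ]
    for agent in agents:
        print(agent.kind.name, agent.to_vector())
    print(task_completion_rate(3, 4))
```

Because every agent type maps to a vector of identical length, a single model input layer could in principle consume all of them; whether MANF-RL-RP encodes agent type this way is not stated in the abstract.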