Intelligent decision-making is increasingly applied to unmanned aerial vehicles (UAVs), and as the UAV one-versus-one (1v1) pursuit-evasion game has developed, the multi-UAV cooperative game has emerged as a new challenge. This paper proposes a deep reinforcement learning-based decision-making model for the multi-role UAV cooperative pursuit-evasion game, addressing the challenge of enabling UAVs to make decisions autonomously in complex game environments. To improve the training efficiency of reinforcement learning in a UAV pursuit-evasion game environment with a high-dimensional state-action space, this paper proposes a multi-environment asynchronous double deep Q-network algorithm with prioritized experience replay to train the UAVs' game policy. Furthermore, to improve cooperation and task completion efficiency while minimizing the cost to the UAVs in the pursuit-evasion game, this paper focuses on the allocation of roles and targets in the multi-UAV environment. Cooperative game decision models for varying numbers of UAVs are obtained by assigning diverse tasks and roles to the UAVs in different scenarios. Simulation results demonstrate that the proposed method enables the UAVs to make decisions autonomously in pursuit-evasion game scenarios and that the UAVs exhibit strong cooperative capability.
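The two standard building blocks named in the abstract, the double deep Q-network target and prioritized experience replay, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the textbook double DQN target and proportional prioritization, and all function names and the hyperparameters `gamma`, `alpha`, `beta`, and `eps` are hypothetical choices for illustration. The multi-environment asynchronous training scheme and the role and target allocation are not covered here.

```python
import random


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it. Inputs are plain lists of Q-values
    for the next state (an assumption made to keep the sketch small)."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |TD error_i| + eps."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), normalized by
    the maximum weight so that updates are only ever scaled down."""
    n = len(probs)
    weights = [(n * p) ** (-beta) for p in probs]
    max_w = max(weights)
    return [w / max_w for w in weights]


def sample_indices(probs, batch_size, rng=random):
    """Draw a minibatch of replay-buffer indices according to the priorities."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)
```

In this sketch, transitions with a larger temporal-difference error are replayed more often, and the importance weights compensate for the resulting sampling bias when the loss is computed.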