Cooperative path planning for heterogeneous UAV swarms poses significant challenges for Multi-Agent Reinforcement Learning (MARL): inter-agent dependencies are asymmetric, rewards are sparse, and policies are prone to catastrophic forgetting during training. To address these issues, this paper proposes AC-MASAC, an attentive curriculum learning framework. The framework introduces a role-aware heterogeneous attention mechanism that explicitly models asymmetric dependencies. In addition, a structured curriculum strategy integrates hierarchical knowledge transfer and stage-proportional experience replay to mitigate sparse rewards and catastrophic forgetting. The framework is validated on a custom multi-agent simulation platform, where it significantly outperforms other advanced methods in Success Rate, Formation Keeping Rate, and Success-weighted Mission Time. The code is available at \textcolor{red}{https://github.com/Wanhao-Liu/AC-MASAC}.