Multi-agent reinforcement learning (MARL) has made substantial progress on cooperative tasks in recent years. However, under the local reward scheme, where each agent receives only its own local reward and no global reward is shared among the agents, traditional MARL algorithms do not sufficiently account for the agents' mutual influence. This influence is especially important in cooperative tasks, because agents must coordinate to achieve better performance. In this paper, we propose a novel algorithm, Mutual-Help-based MARL (MH-MARL), which instructs agents to help each other in order to promote cooperation. MH-MARL uses an expected action module to generate, for each agent, the expected actions of the other agents. These expected actions are then delivered to the other agents, which selectively imitate them during training. Experimental results show that MH-MARL improves the performance of MARL in both success rate and cumulative reward.
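The two steps described above, generating expected actions and selectively imitating them, can be illustrated with a toy sketch. This is not the MH-MARL implementation: the function names, the candidate-search form of the expected action module, the squared-distance imitation loss, and the value-based acceptance rule with its `tolerance` parameter are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Action = Sequence[float]
ValueFn = Callable[[Action], float]


def expected_action(requester_q: ValueFn, candidates: List[Action]) -> Action:
    """Toy stand-in for an expected action module (assumed form).

    Returns the candidate action of ANOTHER agent that the requesting
    agent values most under its own value estimate.
    """
    return max(candidates, key=requester_q)


def selective_imitation_loss(
    own_action: Action,
    expected: Action,
    own_q: ValueFn,
    tolerance: float = 0.0,
) -> float:
    """Hypothetical selective imitation term.

    The imitating agent moves toward the expected action only when doing
    so would not lower its own value estimate by more than `tolerance`;
    otherwise the expected action is ignored and the loss is zero.
    """
    if own_q(expected) + tolerance < own_q(own_action):
        return 0.0  # helping would cost too much: skip imitation
    # Squared Euclidean distance between the two actions.
    return sum((a - e) ** 2 for a, e in zip(own_action, expected))


if __name__ == "__main__":
    # Agent i would like agent j to act close to 1.0.
    q_i = lambda a: -(a[0] - 1.0) ** 2
    wanted = expected_action(q_i, [[0.0], [0.5], [1.0]])

    # Agent j's own estimate prefers actions near 0.8, so helping is cheap.
    q_j = lambda a: -(a[0] - 0.8) ** 2
    print(wanted, selective_imitation_loss([0.2], wanted, q_j))
```

In this sketch the acceptance test is what makes the imitation selective: an agent whose own value estimate would drop sharply by following the expected action keeps its current behavior, while an agent for which helping is cheap is pulled toward the requested action.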