The performance of deep reinforcement learning (DRL) is generally degraded by state-adversarial attacks, in which perturbations are applied to an agent's observations. Most recent research has concentrated on making single-agent reinforcement learning (SARL) algorithms robust against such attacks, while robust multi-agent reinforcement learning (MARL) has received comparatively little attention. Using QMIX, a popular cooperative multi-agent reinforcement learning algorithm, as an example, we discuss four techniques that improve the robustness of SARL algorithms and extend them to multi-agent scenarios. To increase the robustness of MARL algorithms, we train models under a variety of attacks. We then test the models trained with the other attacks by subjecting them to the corresponding attacks used during the training phase. In this way, we organize and summarize techniques for enhancing robustness when applied to MARL.
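As a rough illustration of what a state-adversarial attack does, the sketch below perturbs the observations of several agents within an L-infinity budget. It is not the method of this work: the linear Q-function, the names `random_attack`, `gradient_sign_attack`, `weights`, and `epsilon`, and all numeric values are assumptions chosen for illustration. QMIX itself uses neural per-agent utility networks combined by a mixing network.

```python
import numpy as np


def random_attack(obs, epsilon, rng):
    """Add uniform noise bounded by epsilon to every observation entry.

    obs has shape (n_agents, obs_dim); epsilon is an assumed attack budget.
    """
    noise = rng.uniform(-epsilon, epsilon, size=obs.shape)
    return obs + noise


def gradient_sign_attack(obs, weights, epsilon):
    """Gradient-sign attack on a hypothetical linear Q-function.

    Each agent's Q-values are modeled as obs @ weights, where weights has
    shape (obs_dim, n_actions). The attack shifts every observation by
    epsilon against the gradient of the clean greedy action's Q-value,
    which lowers the value the agent assigns to that action.
    """
    q_values = obs @ weights              # (n_agents, n_actions)
    greedy = q_values.argmax(axis=1)      # clean greedy action per agent
    grad = weights[:, greedy].T           # dQ[greedy]/d(obs), (n_agents, obs_dim)
    return obs - epsilon * np.sign(grad)


rng = np.random.default_rng(0)
obs = rng.normal(size=(3, 4))       # 3 agents, 4-dimensional observations
weights = rng.normal(size=(4, 5))   # 5 discrete actions
epsilon = 0.1

noisy = random_attack(obs, epsilon, rng)
attacked = gradient_sign_attack(obs, weights, epsilon)
```

Both functions return observations that differ from the clean ones by at most `epsilon` per entry. Feeding such perturbed observations to the agents during training, or only at test time, is one simple way to set up the kind of robustness comparison summarized above.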