Cooperation is central to multi-agent reinforcement learning (MARL), yet learned coordination can be fragile when external perturbations disrupt inter-agent interactions. Prior robust MARL methods have primarily considered value-oriented attacks, leaving a gap in robustness when interaction structures themselves are corrupted. In this paper, we propose an interaction-breaking adversarial learning (IBAL) framework that takes an information-theoretic view to construct attacks that impede coordination by perturbing agents' observations and actions, and trains agents to perform reliably under such disruptions. Empirically, our approach improves robustness over existing robust MARL baselines across diverse attack settings and yields stronger performance even under agent-missing scenarios. Our code is available at https://sunwoolee0504.github.io/IBAL.