Recent advances in multi-agent reinforcement learning (MARL) have opened up broad application prospects, including swarm control of drones, collaborative manipulation by robotic arms, and multi-target encirclement. However, the security threats that arise during MARL deployment need more attention and thorough investigation. Recent research reveals that an attacker can rapidly exploit a victim's vulnerabilities and generate adversarial policies that cause the victim to fail at specific tasks, for example by reducing the winning rate of a superhuman-level Go AI to around 20%. Existing studies predominantly focus on two-player competitive environments and assume that attackers possess complete global state observation. In this study, we unveil, for the first time, the capability of attackers to generate adversarial policies even when restricted to partial observations of the victims in multi-agent competitive environments. Specifically, we propose a novel black-box attack (SUB-PLAY) that constructs multiple subgames to mitigate the impact of partial observability and shares transitions among subpolicies to improve the attacker's exploitative ability. Extensive evaluations demonstrate the effectiveness of SUB-PLAY under three typical partial observability limitations. Visualization results indicate that adversarial policies induce significantly different activations in the victims' policy networks. Furthermore, we evaluate three potential defenses to explore ways of mitigating the security threats posed by adversarial policies, and we provide constructive recommendations for deploying MARL in competitive environments.