Recent advances in Multi-Agent Reinforcement Learning have prompted the modeling of intricate interactions between agents in simulated environments. In particular, predator-prey dynamics have captured substantial interest, and various simulations have been tailored to unique requirements. To avoid further time-intensive development, we introduce Aquarium, a comprehensive Multi-Agent Reinforcement Learning environment for predator-prey interaction, enabling the study of emergent behavior. Aquarium is open source and integrates seamlessly with the PettingZoo framework, allowing a quick start with proven algorithm implementations. It features physics-based agent movement on a two-dimensional, edge-wrapping plane. The agent-environment interaction (observations, actions, rewards) and the environment settings (agent speed, prey reproduction, predator starvation, and others) are fully customizable. Besides a resource-efficient visualization, Aquarium supports recording video files, which gives a visual understanding of agent behavior. To demonstrate the environment's capabilities, we conduct preliminary studies that use PPO to train multiple prey agents to evade a predator. In accordance with the literature, we find that Individual Learning performs worse than Parameter Sharing, which significantly improves coordination and sample efficiency.
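To make the notion of a two-dimensional, edge-wrapping plane concrete, the following is a minimal sketch of how positions and distances might be handled on such a plane. It is an illustration only and not Aquarium's actual implementation: the function names (`wrap_position`, `wrapped_delta`, `wrapped_distance`, `step`), the simple Euler update, and the plane dimensions are all assumptions made for this example.

```python
import math


def wrap_position(x, y, width, height):
    """Map a position back onto the plane: leaving one edge re-enters at the opposite edge."""
    return x % width, y % height


def wrapped_delta(a, b, size):
    """Shortest signed displacement from a to b along one wrapping axis of length `size`."""
    d = (b - a) % size
    return d - size if d > size / 2 else d


def wrapped_distance(p, q, width, height):
    """Euclidean distance between p and q, taking the shorter way around each axis."""
    dx = wrapped_delta(p[0], q[0], width)
    dy = wrapped_delta(p[1], q[1], height)
    return math.hypot(dx, dy)


def step(pos, vel, dt, width, height):
    """Hypothetical Euler update of a position, followed by edge wrapping."""
    return wrap_position(pos[0] + vel[0] * dt, pos[1] + vel[1] * dt, width, height)
```

On a 100 by 100 plane, for example, an agent at `(99.0, 0.0)` moving with velocity `(2.0, -1.0)` for one time unit ends up at `(1.0, 99.0)`, and two agents at `(1, 1)` and `(99, 99)` are only about 2.83 units apart because the shortest path crosses the edges. A wrapping-aware distance of this kind matters for any observation or catch rule that depends on how close a predator is to its prey.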