Multi-agent reinforcement learning (MARL) has made significant recent progress, driven largely by deep learning. This progress is naturally starting to benefit multi-robot systems (MRS) in the form of multi-robot RL (MRRL). However, existing infrastructure for training and evaluating policies predominantly focuses on the challenges of coordinating virtual agents and ignores characteristics that matter to robotic systems. Few platforms support realistic robot dynamics, and fewer still can evaluate the Sim2Real performance of learned behavior. To address these issues, we contribute MARBLER: Multi-Agent RL Benchmark and Learning Environment for the Robotarium. MARBLER offers a robust and comprehensive evaluation platform for MRRL by combining Georgia Tech's Robotarium (which enables rapid prototyping on physical MRS) with OpenAI's Gym framework (which facilitates standardized use of modern learning algorithms). MARBLER provides a highly controllable environment with realistic dynamics, including barrier certificate-based obstacle avoidance. It allows anyone in the world to train MRRL algorithms and deploy them reproducibly on a physical testbed. Further, we introduce five novel scenarios inspired by common challenges in MRS and provide support for new custom scenarios. Finally, we use MARBLER to evaluate popular MARL algorithms and provide insights into their suitability for MRRL. In summary, MARBLER can be a valuable tool for the MRS research community, facilitating comprehensive and standardized evaluation of learning algorithms in realistic simulation and on physical hardware. Links to our open-source framework and videos of real-world experiments can be found at https://shubhlohiya.github.io/MARBLER/.
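To make the idea of a Gym-style interface over robots with realistic dynamics more concrete, the following is a minimal sketch of what such an environment could look like. It is not MARBLER's actual API: the class `ToyMultiRobotEnv`, its parameters, and the zero placeholder reward are all hypothetical, and the safety check is a crude distance-based stop that only stands in for barrier certificates, which are considerably more sophisticated.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    x: float
    y: float
    theta: float


class ToyMultiRobotEnv:
    """Hypothetical Gym-style multi-robot environment (illustration only)."""

    def __init__(self, n_robots=2, dt=0.1, safety_radius=0.2, max_steps=50):
        self.n_robots = n_robots
        self.dt = dt
        self.safety_radius = safety_radius
        self.max_steps = max_steps
        self.poses = []
        self.t = 0

    def reset(self):
        # Place robots on a line, 1 m apart, all facing the +x direction.
        self.poses = [Pose(float(i), 0.0, 0.0) for i in range(self.n_robots)]
        self.t = 0
        return self._observations()

    def _observations(self):
        # Each robot observes its own pose as an (x, y, theta) tuple.
        return [(p.x, p.y, p.theta) for p in self.poses]

    def _predict(self, pose, v, w):
        # One Euler step of unicycle dynamics.
        return Pose(
            pose.x + v * math.cos(pose.theta) * self.dt,
            pose.y + v * math.sin(pose.theta) * self.dt,
            pose.theta + w * self.dt,
        )

    def _is_safe(self, i, candidate):
        # Assumed stand-in for barrier certificates: keep a minimum
        # distance to the current positions of all other robots.
        return all(
            math.hypot(candidate.x - q.x, candidate.y - q.y) >= self.safety_radius
            for j, q in enumerate(self.poses)
            if j != i
        )

    def step(self, actions):
        # actions: one (linear velocity, angular velocity) pair per robot.
        new_poses = []
        for i, (v, w) in enumerate(actions):
            candidate = self._predict(self.poses[i], v, w)
            if not self._is_safe(i, candidate):
                # Unsafe command: stop translating but still allow turning.
                candidate = self._predict(self.poses[i], 0.0, w)
            new_poses.append(candidate)
        self.poses = new_poses
        self.t += 1
        rewards = [0.0] * self.n_robots  # placeholder; scenarios define rewards
        done = self.t >= self.max_steps
        return self._observations(), rewards, done, {}
```

A learning algorithm would interact with such an environment through the usual `reset` and `step` loop, for example `obs = env.reset()` followed by repeated calls to `env.step(actions)`. Routing every commanded velocity through a safety check before it is applied reflects the general design in which the policy proposes actions and the platform enforces collision avoidance.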