Multi-agent reinforcement learning (MARL) has made significant progress in recent years, driven largely by deep learning. This progress is starting to benefit multi-robot systems (MRS) in the form of multi-robot RL (MRRL). However, existing infrastructure for training and evaluating policies predominantly focuses on the challenges of coordinating virtual agents and ignores characteristics important to robotic systems. Few platforms support realistic robot dynamics, and fewer still can evaluate the Sim2Real performance of learned behaviors. To address these issues, we contribute MARBLER: Multi-Agent RL Benchmark and Learning Environment for the Robotarium. MARBLER offers a robust and comprehensive evaluation platform for MRRL by combining Georgia Tech's Robotarium (which enables rapid prototyping on physical MRS) with OpenAI's Gym framework (which facilitates standardized use of modern learning algorithms). MARBLER provides a highly controllable environment with realistic dynamics, including barrier certificate-based obstacle avoidance. It allows anyone in the world to train and deploy MRRL algorithms on a physical testbed in a reproducible manner. Further, we introduce five novel scenarios inspired by common challenges in MRS and provide support for new custom scenarios. Finally, we use MARBLER to evaluate popular MARL algorithms and provide insights into their suitability for MRRL. In summary, MARBLER can be a valuable tool for the MRS research community, facilitating comprehensive and standardized evaluation of learning algorithms in realistic simulations and on physical hardware. Links to our open-source framework and videos of real-world experiments can be found at https://shubhlohiya.github.io/MARBLER/.