Multi-agent reinforcement learning (MARL) has enjoyed significant recent progress, thanks to deep learning. This is naturally starting to benefit multi-robot systems (MRS) in the form of multi-robot RL (MRRL). However, existing infrastructure for training and evaluating policies predominantly focuses on the challenges of coordinating virtual agents and ignores characteristics important to robotic systems. Few platforms support realistic robot dynamics, and fewer still can evaluate the Sim2Real performance of learned behavior. To address these issues, we contribute MARBLER: Multi-Agent RL Benchmark and Learning Environment for the Robotarium. MARBLER offers a robust and comprehensive evaluation platform for MRRL by marrying Georgia Tech's Robotarium (which enables rapid prototyping on physical MRS) and OpenAI's Gym framework (which facilitates standardized use of modern learning algorithms). MARBLER offers a highly controllable environment with realistic dynamics, including barrier certificate-based obstacle avoidance. It allows anyone in the world to train MRRL algorithms and deploy them reproducibly on a physical testbed. Further, we introduce five novel scenarios inspired by common challenges in MRS and provide support for new custom scenarios. Finally, we use MARBLER to evaluate popular MARL algorithms and provide insights into their suitability for MRRL. In summary, MARBLER can be a valuable tool for the MRS research community by facilitating comprehensive and standardized evaluation of learning algorithms in realistic simulations and on physical hardware. Links to our open-source framework and videos of real-world experiments can be found at https://shubhlohiya.github.io/MARBLER/.