Scene quantization and action discretization create a sim-to-real gap that poses a significant challenge for RL-based multi-agent exploration. Existing platforms suffer from inefficient sampling and offer limited diversity of Multi-Agent Reinforcement Learning (MARL) algorithms across different scenarios, which restricts their widespread application. To fill these gaps, we propose MAexp, a generic platform for multi-agent exploration that integrates a broad range of state-of-the-art MARL algorithms and representative scenarios. We represent exploration scenarios as point clouds, which yields high-fidelity environment mapping and sampling approximately 40 times faster than existing platforms. Furthermore, equipped with an attention-based Multi-Agent Target Generator and a Single-Agent Motion Planner, MAexp works with arbitrary numbers of agents and accommodates various types of robots. We conduct extensive experiments to establish the first benchmark featuring several high-performance MARL algorithms across typical scenarios for robots with continuous actions, highlighting the distinct strengths of each algorithm in different scenarios.