Environment design plays a critical role in shaping how cooperative multi-agent reinforcement learning (MARL) algorithms are developed and evaluated. While existing benchmarks expose important challenges, they often lack the modularity needed to build custom evaluation scenarios. We introduce the Totally Accelerated Battle Simulator in JAX (TABX), a high-throughput sandbox for reconfigurable multi-agent tasks. TABX provides granular control over environment parameters, enabling systematic investigation of emergent agent behaviors and algorithmic trade-offs across tasks of varying complexity. By leveraging JAX for hardware-accelerated execution on GPUs, TABX supports massively parallel simulation and significantly reduces computational overhead. As a fast, extensible, and easily customizable framework, TABX facilitates the study of MARL agents in complex structured domains and serves as a scalable foundation for future research. Our code is available at: https://anonymous.4open.science/r/TABX-00CA.