We study the problem of generating arbitrarily large environments to improve the throughput of multi-robot systems. Prior work proposes Quality Diversity (QD) algorithms as an effective method for optimizing the environments of automated warehouses. However, these approaches optimize only relatively small environments and fall short of real-world warehouse sizes, because the search space grows exponentially with environment size. Additionally, previous methods have been tested in simulation with at most 350 robots, while practical warehouses can host thousands of robots. In this paper, instead of optimizing environments directly, we propose to optimize Neural Cellular Automata (NCA) environment generators via QD algorithms. We train a collection of NCA generators with QD algorithms in small environments and then use the generators to produce arbitrarily large environments at test time. We show that NCA environment generators maintain consistent, regularized patterns regardless of environment size, significantly enhancing the scalability of multi-robot systems in two different domains with up to 2,350 robots. Additionally, we demonstrate that our method scales a single-agent reinforcement learning policy to arbitrarily large environments with similar patterns. We include the source code at \url{https://github.com/lunjohnzhang/warehouse_env_gen_nca_public}.
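To illustrate why a generator trained on small environments can be applied to larger ones, the following is a minimal sketch of an NCA-style generator, not the implementation from the linked repository. The tile set, the single-seed initialization, the 3x3 kernel, the step count, and all function names (`init_params`, `nca_step`, `generate`) are assumptions made for illustration. The weights here are random; in the approach described above they would be optimized by a QD algorithm, which this sketch omits along with any validity checks on the generated layout.

```python
import numpy as np

N_TILES = 3  # hypothetical tile types: 0 = floor, 1 = shelf, 2 = endpoint


def init_params(rng, n_tiles=N_TILES, kernel=3):
    """Random convolution weights (out_tiles, in_tiles, k, k).

    Assumption: stands in for parameters that a QD algorithm would optimize.
    """
    return rng.normal(size=(n_tiles, n_tiles, kernel, kernel))


def nca_step(grid, weights):
    """One update: one-hot encode, convolve with zero padding, argmax per cell."""
    n_tiles, _, k, _ = weights.shape
    pad = k // 2
    h, w = grid.shape
    one_hot = np.eye(n_tiles)[grid].transpose(2, 0, 1)  # (n_tiles, h, w)
    padded = np.pad(one_hot, ((0, 0), (pad, pad), (pad, pad)))
    logits = np.zeros((n_tiles, h, w))
    for dy in range(k):
        for dx in range(k):
            window = padded[:, dy:dy + h, dx:dx + w]  # shifted view of the grid
            logits += np.einsum("oi,ihw->ohw", weights[:, :, dy, dx], window)
    return logits.argmax(axis=0)


def generate(weights, height, width, n_steps=20):
    """Run the same local rule on a grid of any size, starting from one seed cell."""
    grid = np.zeros((height, width), dtype=int)
    grid[height // 2, width // 2] = 1
    for _ in range(n_steps):
        grid = nca_step(grid, weights)
    return grid


rng = np.random.default_rng(0)
weights = init_params(rng)
small = generate(weights, 9, 12)    # size used during (hypothetical) training
large = generate(weights, 45, 60)   # same weights, larger environment
```

The parameter count depends only on the kernel size and the number of tile types, not on the grid dimensions, so the search space for the generator stays fixed while the generated environment grows.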