Scalability is one of the essential issues that multi-agent reinforcement learning (MARL) algorithms must address before they can be applied to real-world problems, which typically involve a very large number of agents. To this end, parameter sharing across agents has been widely used, since it reduces training time by decreasing the number of parameters and increasing sample efficiency. However, sharing the same parameters across agents limits the representational capacity of the joint policy, and performance can consequently degrade in multi-agent tasks that require different behaviors from different agents. In this paper, we propose a simple method that applies structured pruning to a deep neural network to increase the representational capacity of the joint policy without introducing additional parameters. We evaluate the proposed method on several benchmark tasks, and numerical results show that it significantly outperforms other parameter-sharing methods.
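To make the idea concrete, the following is a minimal sketch of how structured pruning could be combined with parameter sharing. It is not the implementation from this paper. It assumes that all agents share one small MLP and that each agent is assigned a fixed binary mask that removes whole hidden units. The names `make_unit_masks` and `SharedMaskedPolicy`, the random choice of masks, and the `keep_ratio` parameter are all hypothetical and chosen only for illustration.

```python
import numpy as np


def make_unit_masks(n_agents, hidden_dim, keep_ratio, seed=0):
    """Return an (n_agents, hidden_dim) 0/1 array.

    Each row keeps a random subset of hidden units (an assumption made
    for this sketch; the abstract does not say how masks are chosen).
    """
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(keep_ratio * hidden_dim)))
    masks = np.zeros((n_agents, hidden_dim))
    for i in range(n_agents):
        kept = rng.choice(hidden_dim, size=n_keep, replace=False)
        masks[i, kept] = 1.0
    return masks


class SharedMaskedPolicy:
    """One set of weights shared by all agents, plus per-agent unit masks."""

    def __init__(self, obs_dim, hidden_dim, n_actions, masks, seed=0):
        rng = np.random.default_rng(seed)
        # Shared trainable parameters (identical for every agent).
        self.W1 = rng.normal(0.0, 0.1, (obs_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, n_actions))
        self.b2 = np.zeros(n_actions)
        # Fixed, non-trainable masks: one row per agent.
        self.masks = masks

    def action_probs(self, agent_id, obs):
        h = np.maximum(0.0, obs @ self.W1 + self.b1)  # ReLU hidden layer
        h = h * self.masks[agent_id]  # structured pruning: drop whole units
        logits = h @ self.W2 + self.b2
        z = np.exp(logits - logits.max())  # numerically stable softmax
        return z / z.sum()

    def n_params(self):
        # Counts only the shared trainable weights; masks are not trained.
        return self.W1.size + self.b1.size + self.W2.size + self.b2.size


# Usage: three agents, one shared network, different sub-networks.
masks = make_unit_masks(n_agents=3, hidden_dim=16, keep_ratio=0.5)
policy = SharedMaskedPolicy(obs_dim=4, hidden_dim=16, n_actions=2, masks=masks)
obs = np.ones(4)
probs = [policy.action_probs(i, obs) for i in range(3)]
```

In this sketch the number of trainable parameters does not grow with the number of agents. Each agent still evaluates a different sub-network, which is one way the joint policy can express different behaviors for different agents.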