Multi-agent reinforcement learning (MARL) has achieved promising results in recent years. However, most existing reinforcement learning methods require large amounts of training data. Data-efficient reinforcement learning depends on strong inductive biases, which current MARL approaches ignore. Inspired by the symmetry found in multi-agent systems, this paper proposes a framework that exploits this prior knowledge by integrating data augmentation and a well-designed consistency loss into existing MARL methods. The framework is model-agnostic and can be applied to most current MARL algorithms. Experiments on multiple challenging tasks demonstrate its effectiveness. The framework is also applied to a physical multi-robot testbed to show its superiority.
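The abstract does not give the symmetry group or the form of the consistency loss, so the following is only a minimal sketch under stated assumptions: the symmetry is planar rotation by multiples of 90 degrees, an observation is a stack of 2D vectors, an action is a single 2D vector, the reward is unchanged by the rotation, and the consistency term is a mean squared error. The names `rotation_matrix`, `augment_transition`, and `consistency_loss` are hypothetical and do not come from the paper.

```python
import numpy as np

# Assumption: the symmetry is rotation of the plane by multiples of 90 degrees.
ROT90 = np.array([[0.0, -1.0],
                  [1.0, 0.0]])


def rotation_matrix(k: int) -> np.ndarray:
    """Return the 2x2 matrix that rotates the plane by k * 90 degrees."""
    return np.linalg.matrix_power(ROT90, k % 4)


def augment_transition(obs, action, next_obs, k):
    """Rotate one transition to create an extra, equally valid training sample.

    obs, next_obs: arrays of shape (n, 2), each row a 2D vector (assumed layout).
    action: array of shape (2,), a 2D vector (assumed layout).
    The reward is assumed invariant under the rotation, so it is not touched here.
    """
    R = rotation_matrix(k)
    # Rotating every row vector v gives R @ v, which is obs @ R.T for stacked rows.
    return obs @ R.T, R @ action, next_obs @ R.T


def consistency_loss(policy, obs, k):
    """Penalise a policy whose output does not rotate together with its input.

    Assumed form: mean squared error between policy(rotated obs)
    and the rotated policy(obs).
    """
    R = rotation_matrix(k)
    action = policy(obs)
    action_on_rotated = policy(obs @ R.T)
    return float(np.mean((action_on_rotated - R @ action) ** 2))


if __name__ == "__main__":
    obs = np.array([[1.0, 0.0], [0.0, 2.0]])
    act = np.array([1.0, 0.0])
    print(augment_transition(obs, act, obs, k=1))
    # A policy that moves toward the mean position respects the symmetry.
    print(consistency_loss(lambda o: o.mean(axis=0), obs, k=1))
    # A policy that always moves along +x does not.
    print(consistency_loss(lambda o: np.array([1.0, 0.0]), obs, k=1))
```

In a setup like this, the augmented transitions would be added to the replay data and the consistency term added to the usual training loss with some weight. Neither step depends on the underlying algorithm, which is one way to read the model-agnostic claim.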