Many multi-agent systems require inter-agent communication to achieve their goals. By learning the communication protocol alongside the action protocol with multi-agent reinforcement learning techniques, the agents gain the flexibility to determine which information should be shared. However, as the number of agents increases, we need to create an encoding of the information contained in the messages they exchange. In this paper, we investigate the effect of increasing both the amount of information that a message should contain and the number of agents. We evaluate these effects on two message encoding methods: the mean message encoder and the attention message encoder. We perform our experiments in a matrix environment. Surprisingly, our results show that the mean message encoder consistently outperforms the attention message encoder. We therefore analyse the communication protocol used by the agents that use the mean message encoder and conclude that these agents combine an exponential and a logarithmic function in their communication policy to avoid losing important information after the mean message encoder is applied.
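The two encoders and the exponential/logarithmic idea named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of plain scaled dot-product attention without learned projections, and the sharpness parameter `k` are all assumptions made for illustration, and the exp/log pairing shown here is only one plausible way such a combination could preserve information through a mean.

```python
import math


def mean_encode(messages):
    """Average a list of equal-length message vectors element-wise."""
    n = len(messages)
    return [sum(column) / n for column in zip(*messages)]


def attention_encode(query, messages):
    """Hypothetical attention encoder: scaled dot-product attention in
    which the messages serve as both keys and values (no learned weights)."""
    d = len(query)
    scores = [
        sum(q * m for q, m in zip(query, msg)) / math.sqrt(d)
        for msg in messages
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * msg[i] for w, msg in zip(weights, messages))
        for i in range(d)
    ]


def exp_log_through_mean(values, k=20.0):
    """Assumed illustration: senders transmit exp(k * v), the mean encoder
    averages the messages, and the receiver applies log(.) / k.
    The result stays close to the largest value instead of being diluted."""
    encoded = [math.exp(k * v) for v in values]  # sender side
    averaged = sum(encoded) / len(encoded)       # mean message encoder
    return math.log(averaged) / k                # receiver side


values = [0.0, 0.0, 0.0, 1.0]
plain_mean = mean_encode([[v] for v in values])[0]
recovered = exp_log_through_mean(values)
```

With one agent sending 1.0 and three sending 0.0, the plain mean is 0.25, whereas the exp/log variant returns roughly 0.93, so the single informative value is still visible after averaging. Larger `k` moves the result closer to the maximum at the cost of larger intermediate magnitudes.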