We study emergent communication in a multi-agent reinforcement learning setting in which agents solve cooperative tasks and have access to a communication channel. The channel may carry either discrete symbols or continuous variables. We introduce an inductive bias that aids the emergence of good communication protocols for continuous messages, and we examine the effect of this type of inductive bias on both continuous and discrete messages, either on its own or in combination with reinforcement learning. We demonstrate that this type of inductive bias has a beneficial effect on the communication protocols learnt in two toy environments, Negotiation and Sequence Guess.