Emergent Language (EL) research studies the emergence of communication among artificial agents. Although symbolic communication channels more closely mirror the discrete nature of human language, learning such protocols remains fundamentally difficult due to the non-differentiability of symbol sampling. Existing approaches typically rely on high-variance gradient estimators such as REINFORCE or on continuous relaxations such as Gumbel-Softmax, both of which suffer from limitations in training stability and scalability. Motivated by cognitive theories that emphasize intrapersonal processes preceding communication, we explore self-play as a substrate for language emergence prior to mutual interaction. We introduce Vector Quantized Emergent Language (VQEL), a novel architecture that incorporates vector quantization into the message generation process. VQEL enables agents to perform self-play using discrete internal representations derived from a learned codebook while preserving end-to-end differentiability. Moreover, the resulting vector-quantized codebook naturally induces a symbolic vocabulary that can be directly transferred and aligned during subsequent mutual play with other agents. Empirical results show that agents pretrained via VQEL self-play achieve more consistent symbol alignment and higher task success when later engaged in mutual interaction. These findings position self-play as a principled and effective mechanism for learning discrete communication protocols, addressing key optimization and representational challenges in emergent language systems.
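The core mechanism the abstract describes, mapping a continuous speaker state onto a discrete codebook entry while keeping gradients flowing via the straight-through trick, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `quantize`, the toy sizes `K` and `D`, and the random codebook are all hypothetical, and a real system would train the codebook inside an autograd framework (e.g. PyTorch), where `e_k - z` would be wrapped in a stop-gradient so that the identity-gradient behavior shown in the comment holds automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: a vocabulary of K code vectors ("symbols"),
# each of dimension D. In VQEL the codebook is learned; here it is random.
K, D = 8, 4
codebook = rng.normal(size=(K, D))

def quantize(z):
    """Map a continuous speaker state z to its nearest codebook vector.

    Returns (k, q): k is the discrete symbol index, q the quantized vector.
    In an autograd framework, writing q = z + stop_gradient(e_k - z) makes
    the forward pass emit e_k while dq/dz is the identity, so the sampling
    step stays end-to-end differentiable (the "straight-through" estimator).
    """
    dists = np.linalg.norm(codebook - z, axis=1)  # distance to every code
    k = int(np.argmin(dists))                     # discrete symbol index
    e_k = codebook[k]
    q = z + (e_k - z)  # numerically equals e_k; the residual would be
                       # detached from the graph in a framework with autograd
    return k, q

z = rng.normal(size=D)           # continuous representation of an object
k, q = quantize(z)               # discrete symbol + differentiable output
```

During self-play, the same agent's listener module consumes `q` and is trained to recover the original object, which shapes both the encoder and the codebook before any mutual play begins; the indices `k` then serve directly as the shared symbolic vocabulary.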