Discussion and debate among Large Language Models (LLMs) have gained considerable attention due to their potential to enhance the reasoning ability of LLMs. Although natural language is an obvious choice for communication given LLMs' language understanding capabilities, the token sampling step needed when generating natural language poses a risk of information loss, as it uses only one token to represent the model's belief across the entire vocabulary. In this paper, we introduce a communication regime named CIPHER (Communicative Inter-Model Protocol Through Embedding Representation) to address this issue. Specifically, we remove the token sampling step from LLMs and let them communicate their beliefs across the vocabulary through the expectation of the raw transformer output embeddings. Remarkably, by deviating from natural language, CIPHER offers the advantage of encoding a broader spectrum of information without any modification to the model weights. While state-of-the-art LLM debate methods using natural language outperform traditional inference by a margin of 1.5-8%, our experimental results show that CIPHER debate further extends this lead by 1-3.5% across five reasoning tasks and multiple open-source LLMs of varying sizes. This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
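The contrast between sampling one token and passing on an expectation over the vocabulary can be illustrated with a minimal sketch. It assumes that "expectation of the output embeddings" means a probability-weighted average of token embeddings, which is one reading of the abstract rather than a confirmed implementation. The toy vocabulary size, the random `embedding_table`, and the function names are all hypothetical.

```python
import numpy as np

def softmax(logits):
    """Convert logits into a probability distribution over the vocabulary."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def sampled_token_embedding(logits, embedding_table, rng):
    """Natural-language style: sample ONE token and keep only its embedding."""
    probs = softmax(logits)
    token_id = rng.choice(len(probs), p=probs)
    return embedding_table[token_id]

def expected_embedding(logits, embedding_table):
    """Assumed CIPHER-style step: average all token embeddings, weighted by
    the model's belief, instead of committing to a single token."""
    probs = softmax(logits)
    return probs @ embedding_table

# Toy setup (hypothetical): 5-token vocabulary, 4-dimensional embeddings.
rng = np.random.default_rng(0)
vocab_size, dim = 5, 4
embedding_table = rng.normal(size=(vocab_size, dim))

# The model is almost undecided between tokens 0 and 1.
logits = np.array([2.0, 1.9, 0.1, -1.0, -3.0])

natural = sampled_token_embedding(logits, embedding_table, rng)
cipher_like = expected_embedding(logits, embedding_table)
```

In this sketch, `natural` is always exactly one row of `embedding_table`, so the near-tie between tokens 0 and 1 is discarded once a token is drawn. `cipher_like` blends both rows in proportion to their probabilities, which is the extra information the abstract argues is lost during sampling.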