Multi-Agent Discussion (MAD), in which multiple LLM instances collaboratively solve problems via structured discussion, has garnered increasing attention recently. However, we find that current MAD methods easily suffer from discussion inconsistency, where LLMs fail to reach a coherent solution due to the misalignment between their individual contexts. In this paper, we introduce a multi-LLM context learning method (M2CL) that learns a context generator for each agent, capable of dynamically generating context instructions per discussion round via automatic information organization and refinement. Specifically, inspired by our theoretical insights on the context instruction, M2CL trains the generators to control context coherence and output discrepancies via a carefully crafted self-adaptive mechanism. This enables LLMs to avoid premature convergence on majority noise and to progressively reach the correct consensus. We evaluate M2CL on challenging tasks, including academic reasoning, embodied tasks, and mobile control. The results show that M2CL significantly surpasses existing methods by 20%--50%, while enjoying favorable transferability and computational efficiency.