Recent advances in large language models (LLMs) demonstrate substantial capabilities in natural language understanding and generation tasks. With the growing number of LLMs, how to harness the collective expertise of multiple models is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. In our approach, we construct a layered MoA architecture wherein each layer comprises multiple LLM agents, and each agent takes all the outputs from agents in the previous layer as auxiliary information when generating its response. MoA models achieve state-of-the-art performance on AlpacaEval 2.0, MT-Bench, and FLASK, surpassing GPT-4 Omni. For example, our MoA built only from open-source LLMs leads AlpacaEval 2.0 by a substantial margin, achieving a score of 65.1% compared to 57.5% for GPT-4 Omni.
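To make the layered flow concrete, here is a minimal Python sketch of the architecture described above, assuming a generic chat-completion callable. The `query_llm` function, the per-layer model lists, and the aggregation instruction are illustrative assumptions, not the paper's exact prompts or configuration.

```python
from typing import Callable, List

# Hypothetical stand-in for any chat-completion call:
# (model_name, prompt) -> response text.
QueryFn = Callable[[str, str], str]

# Illustrative aggregation instruction (an assumption, not the paper's verbatim prompt).
AGGREGATE_INSTRUCTION = (
    "You have been provided with responses from several models to the user "
    "query below. Synthesize them into a single, high-quality answer."
)

def moa_generate(
    query: str,
    layers: List[List[str]],  # each inner list holds the agent model names for one layer
    query_llm: QueryFn,
) -> str:
    """Run the layered MoA pipeline: every agent in a layer receives all
    outputs from the previous layer as auxiliary information."""
    prev_outputs: List[str] = []
    for layer in layers:
        prompt = query
        if prev_outputs:
            # Prepend the previous layer's responses as auxiliary context.
            refs = "\n\n".join(
                f"Response {i + 1}: {r}" for i, r in enumerate(prev_outputs)
            )
            prompt = f"{AGGREGATE_INSTRUCTION}\n\n{refs}\n\nQuery: {query}"
        prev_outputs = [query_llm(model, prompt) for model in layer]
    # Assumes the final layer contains a single aggregator model, whose
    # output is the MoA response.
    return prev_outputs[0]
```

In use, `layers` might contain several open-source proposer models in the early layers and a single aggregator in the last layer; the sketch only fixes the information flow, leaving model choice and the serving API to the caller.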