Enabling Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration

Large language models (LLMs) have shown complementary strengths across tasks and instances, motivating research on ensembling LLMs to push the frontier by leveraging the wisdom of the crowd. Existing work achieves this by training an extra reward model or fusion model to select or fuse the candidate answers. However, the trained models in these methods may generalize poorly. In addition, existing methods use textual responses as the communication medium, ignoring the rich information in the internal representations of the neural networks. Therefore, we propose DEEPEN, a training-free ensemble framework that averages the probability distributions output by different LLMs. A key challenge in this paradigm is the vocabulary discrepancy between heterogeneous LLMs, which prevents the distributions from being averaged directly. To address this challenge, DEEPEN maps the probability distribution of each model from its own probability space to a universal relative space, based on relative representation theory, and performs the aggregation there. The aggregated result is then mapped back to the probability space of one LLM via a search-based inverse transformation to determine the generated token. We conduct experiments on ensembles of various LLMs ranging from 6B to 70B parameters. Experimental results show that DEEPEN achieves consistent improvements across six popular benchmarks covering subject examination, reasoning, and knowledge QA, demonstrating the effectiveness of our approach.
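The pipeline described in the abstract (map to a relative space, average, search back) can be illustrated with a small NumPy sketch. This is only a toy illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the relative representation is assumed to be the cosine similarity between each token embedding and the embeddings of a set of anchor tokens shared by all models, and the "search-based inverse transformation" is assumed here to be plain gradient descent on the logits of the main model.

```python
import numpy as np

def relative_matrix(emb, anchor_ids):
    """Cosine similarity of every token embedding to each anchor embedding.

    Assumption: anchors are tokens that exist in every model's vocabulary,
    listed in the same order for each model.
    """
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit[anchor_ids].T  # shape: (vocab_size, n_anchors)

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def inverse_search(target, rel, init_prob, steps=200, lr=0.5):
    """Search for a distribution whose relative representation is near target.

    Assumption: gradient descent on logits, minimizing ||p @ rel - target||^2.
    """
    logits = np.log(init_prob + 1e-12)
    for _ in range(steps):
        p = softmax(logits)
        residual = p @ rel - target              # (n_anchors,)
        grad_p = 2.0 * rel @ residual            # dLoss/dp, (vocab_size,)
        grad_logits = p * (grad_p - p @ grad_p)  # chain rule through softmax
        logits = logits - lr * grad_logits
    return softmax(logits)

def ensemble_step(probs, rels, main=0):
    """Average the models in relative space, then map back to the main model."""
    relative = [p @ r for p, r in zip(probs, rels)]  # each (n_anchors,)
    aggregated = np.mean(relative, axis=0)
    return inverse_search(aggregated, rels[main], probs[main])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy "models" with different vocabulary sizes and embedding widths.
    emb_a, emb_b = rng.normal(size=(5, 4)), rng.normal(size=(7, 3))
    rel_a = relative_matrix(emb_a, [0, 1, 2])
    rel_b = relative_matrix(emb_b, [3, 4, 5])
    p_a, p_b = softmax(rng.normal(size=5)), softmax(rng.normal(size=7))
    fused = ensemble_step([p_a, p_b], [rel_a, rel_b], main=0)
    print("next token id in model A's vocabulary:", int(fused.argmax()))
```

Because both distributions are projected onto the same anchor axes, the two vectors can be averaged even though the vocabularies have different sizes; only the final search step is tied to a particular model's vocabulary.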

