Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

In conversational AI research, there is a noticeable trend towards developing models with ever larger numbers of parameters, exemplified by models like ChatGPT. While these expansive models tend to generate increasingly better chat responses, they demand significant computational resources and memory. This study explores a pertinent question: can a combination of smaller models collaboratively achieve comparable or enhanced performance relative to a single large model? We introduce an approach termed "blending", a straightforward yet effective method of integrating multiple chat AIs. Our empirical evidence suggests that when specific smaller models are blended, they can match or outperform much larger counterparts. For instance, integrating just three models of moderate size (6B/13B parameters) can rival or even surpass the performance metrics of a substantially larger model like ChatGPT (175B+ parameters). This hypothesis is rigorously tested using A/B testing with a large user base on the Chai research platform over a span of thirty days. The findings underscore the potential of the "blending" strategy as a viable approach for enhancing chat AI efficacy without a corresponding surge in computational demands.
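
The abstract does not spell out how the chat AIs are integrated. In the full paper, blending amounts to picking one component model at random at each turn and letting it produce the reply, conditioned on the whole conversation so far. The following is a minimal sketch of that idea, not the authors' implementation: the names `ChatModel`, `make_stub`, `blended_reply` and `run_conversation` are hypothetical, and the stub models stand in for real 6B/13B systems.

```python
import random
from typing import Callable, List, Sequence

# Assumption: a chat model is any callable that maps the conversation
# history (a sequence of messages) to the next reply.
ChatModel = Callable[[Sequence[str]], str]


def make_stub(name: str) -> ChatModel:
    """Build a hypothetical stand-in for a real chat model."""
    def reply(history: Sequence[str]) -> str:
        # A real model would generate text from the history; the stub
        # only records who answered and how long the history was.
        return f"[{name}] reply after {len(history)} messages"
    return reply


def blended_reply(models: Sequence[ChatModel],
                  history: Sequence[str],
                  rng: random.Random) -> str:
    """Pick one component model uniformly at random for this turn."""
    model = rng.choice(models)
    # The chosen model sees the full history, including replies that
    # other component models produced on earlier turns.
    return model(history)


def run_conversation(models: Sequence[ChatModel],
                     user_turns: Sequence[str],
                     seed: int = 0) -> List[str]:
    """Alternate user messages with blended replies."""
    rng = random.Random(seed)
    history: List[str] = []
    for user_message in user_turns:
        history.append(user_message)
        history.append(blended_reply(models, history, rng))
    return history


# Three stub components, mirroring the three-model blend in the abstract.
models = [make_stub("model_a"), make_stub("model_b"), make_stub("model_c")]
transcript = run_conversation(models, ["Hi!", "Tell me a story.", "Go on."])
```

Because only one component runs per turn, the inference cost of each reply in this sketch is that of a single small model, which is consistent with the abstract's claim that blending avoids a surge in computational demands.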

