Businesses increasingly rely on large language models (LLMs) to automate simple, repetitive tasks instead of developing custom machine learning models. LLMs require few, if any, training examples and can be used without expertise in model development. However, this convenience comes at the cost of substantially higher resource and energy consumption compared to smaller models, which often achieve similar predictive performance on simple tasks. In this paper, we present our vision for just-in-time model replacement (JITR): upon identifying a recurring task in calls to an LLM, the system transparently replaces the model with a cheaper alternative that performs well on this specific task. JITR retains the ease of use and low development effort of LLMs while saving significant cost and energy. We discuss the main challenges in realizing this vision, namely identifying recurring tasks and creating a custom model. Specifically, we argue that model search and transfer learning will play a crucial role in JITR by making it possible to efficiently find and fine-tune models for a recurring task. Using our JITR prototype, Poodle, we achieve significant savings on example tasks.