Large language models have demonstrated outstanding performance on a wide range of tasks such as question answering and code generation. At a high level, given an input, a language model can be used to automatically complete the sequence in a statistically likely way. Building on this, users prompt these models with language instructions or examples to implement a variety of downstream tasks. Advanced prompting methods can even involve interaction between the language model, a user, and external tools such as calculators. However, to obtain state-of-the-art performance or to adapt language models to specific tasks, complex task- and model-specific programs have to be implemented, which may still require ad-hoc interaction. To address this, we present the novel idea of Language Model Programming (LMP). LMP generalizes language model prompting from pure text prompts to an intuitive combination of text prompting and scripting. Additionally, LMP allows constraints to be specified over the language model output. This enables easy adaptation to many tasks while abstracting language model internals and providing high-level semantics. To enable LMP, we implement LMQL (short for Language Model Query Language), which leverages the constraints and control flow from an LMP prompt to generate an efficient inference procedure that minimizes the number of expensive calls to the underlying language model. We show that LMQL can capture a wide range of state-of-the-art prompting methods in an intuitive way, especially facilitating interactive flows that are challenging to implement with existing high-level APIs. Our evaluation shows that we retain or increase the accuracy on several downstream tasks, while also significantly reducing the required amount of computation or cost in the case of pay-to-use APIs (26-85% cost savings).
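To make the idea of constraints over language model output more concrete, the following is a minimal sketch of constraint-guided decoding by token masking. It is not LMQL syntax and not the authors' implementation: `toy_model`, `constrained_decode`, `answer_is_yes_or_no`, and the four-token vocabulary are all hypothetical stand-ins chosen for illustration. The sketch only shows the general principle that filtering disallowed tokens during decoding can yield a valid output in a single pass, rather than generating freely and retrying on invalid results.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]


def toy_model(prefix: List[str]) -> Scores:
    """Hypothetical stand-in for a language model: one score per next token."""
    if not prefix:
        # Unconstrained greedy decoding would pick "maybe" here.
        return {"maybe": 0.5, "yes": 0.3, "no": 0.15, "<eos>": 0.05}
    return {"<eos>": 0.9, "yes": 0.04, "no": 0.03, "maybe": 0.03}


def answer_is_yes_or_no(prefix: List[str], token: str) -> bool:
    """Illustrative constraint: the output is exactly one of 'yes' or 'no'."""
    if not prefix:
        return token in {"yes", "no"}
    return token == "<eos>"


def constrained_decode(
    model: Callable[[List[str]], Scores],
    allowed: Callable[[List[str], str], bool],
    max_tokens: int = 4,
) -> Tuple[List[str], int]:
    """Greedy decoding that masks tokens violating the constraint.

    Returns the decoded tokens and the number of model calls made.
    """
    output: List[str] = []
    calls = 0
    for _ in range(max_tokens):
        scores = model(output)
        calls += 1
        # Keep only the tokens the constraint permits at this position.
        masked = {tok: s for tok, s in scores.items() if allowed(output, tok)}
        if not masked:
            break  # no admissible continuation
        token = max(masked, key=masked.get)
        if token == "<eos>":
            break
        output.append(token)
    return output, calls


result, calls = constrained_decode(toy_model, answer_is_yes_or_no)
print(result, calls)  # ['yes'] 2
```

In this toy setting the constraint steers decoding away from the higher-scoring but invalid token "maybe", and the valid answer is obtained with two model calls and no retries.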