Large language models have demonstrated outstanding performance on a wide range of tasks such as question answering and code generation. At a high level, given an input, a language model can be used to automatically complete the sequence in a statistically likely way. Building on this, users prompt these models with language instructions or examples to implement a variety of downstream tasks. Advanced prompting methods can even involve interaction between the language model, a user, and external tools such as calculators. However, to obtain state-of-the-art performance or to adapt language models to specific tasks, complex task- and model-specific programs have to be implemented, which may still require ad-hoc interaction. To address this, we present the novel idea of Language Model Programming (LMP). LMP generalizes language model prompting from pure text prompts to an intuitive combination of text prompting and scripting. Additionally, LMP allows constraints to be specified over the language model output. This enables easy adaptation to many tasks while abstracting language model internals and providing high-level semantics. To enable LMP, we implement LMQL (short for Language Model Query Language), which leverages the constraints and control flow from an LMP prompt to generate an efficient inference procedure that minimizes the number of expensive calls to the underlying language model. We show that LMQL can capture a wide range of state-of-the-art prompting methods in an intuitive way, especially facilitating interactive flows that are challenging to implement with existing high-level APIs. Our evaluation shows that we retain or increase the accuracy on several downstream tasks, while also significantly reducing the required amount of computation or, in the case of pay-to-use APIs, the cost (26-85% cost savings).
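To make the idea of constraints over the language model output more concrete, the following is a minimal Python sketch of constrained decoding by token masking. It is an illustration under stated assumptions, not LMQL's syntax or its actual inference procedure: `toy_model`, `constrained_greedy_decode`, and `yes_or_no` are hypothetical names, and the scores returned by the toy model are made up for the example. The point it tries to show is that enforcing a constraint while decoding avoids generating an invalid answer and then re-querying the model.

```python
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a language model.

    Returns a made-up score for each candidate next token given the prefix.
    """
    if not prefix:
        # Unconstrained, the most likely first token would be "maybe".
        return {"maybe": 0.5, "yes": 0.3, "no": 0.15, EOS: 0.05}
    return {EOS: 0.9, "maybe": 0.05, "yes": 0.03, "no": 0.02}


def constrained_greedy_decode(
    model: Callable[[List[str]], Dict[str, float]],
    allowed: Callable[[List[str], str], bool],
    max_tokens: int = 5,
) -> Tuple[List[str], int]:
    """Greedy decoding that masks tokens violating the constraint.

    Returns the decoded tokens and the number of model calls made.
    """
    output: List[str] = []
    calls = 0
    for _ in range(max_tokens):
        scores = model(output)
        calls += 1
        # Keep only the tokens the constraint permits at this position.
        masked = {tok: s for tok, s in scores.items() if allowed(output, tok)}
        if not masked:
            break  # no valid continuation exists
        token = max(masked, key=masked.get)
        if token == EOS:
            break
        output.append(token)
    return output, calls


def yes_or_no(prefix: List[str], token: str) -> bool:
    """Assumed constraint: the answer is exactly one token, "yes" or "no"."""
    if not prefix:
        return token in {"yes", "no"}
    return token == EOS


answer, calls = constrained_greedy_decode(toy_model, yes_or_no)
print(answer, calls)
```

In this toy setting the constraint steers the first step away from "maybe" toward "yes", and decoding finishes in two model calls. An approach that generates freely and validates afterwards would need at least one additional round of calls whenever the first answer is invalid.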