Many recent prompting strategies for large language models (LLMs) query the model multiple times sequentially: first to produce intermediate results, and then to produce the final answer. With these methods, however, neither the decoder nor the model is aware of potential follow-up prompts, which leads to disconnected and undesirably wordy intermediate responses. In this work, we address this issue by proposing prompt sketching, a new prompting paradigm in which an LLM responds not only by completing a prompt, but also by predicting values for multiple variables in a template. Sketching thus grants users more control over the generation process, e.g., by providing a reasoning framework via intermediate instructions, leading to better overall results. The key idea enabling sketching with existing autoregressive models is to adapt the decoding procedure so that it also scores follow-up instructions during text generation, thereby optimizing the overall template likelihood during inference. Our experiments show that, in a zero-shot setting, prompt sketching outperforms existing sequential prompting schemes such as direct asking or chain-of-thought on 7 out of 8 LLM benchmarking tasks, including state tracking, arithmetic reasoning, and general question answering. To facilitate future use, we release a number of generic yet effective sketches applicable to many tasks, and an open-source library called dclib, which powers our sketch-aware decoders.
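To make the key idea concrete, the following is a minimal toy sketch of scoring a follow-up instruction while choosing a value for a template variable. It is an illustration under stated assumptions, not the dclib API or the decoders used in the paper: the bigram table `BIGRAM_PROBS`, its made-up probabilities, and the helper names `continuation_logprob`, `pick_sequential`, and `pick_sketch_aware` are all hypothetical, and a real decoder would obtain these scores from an LLM.

```python
import math

# Toy bigram "model": P(next token | previous token). The numbers are invented
# for illustration; a real decoder would query an LLM for these probabilities.
BIGRAM_PROBS = {
    (":", "maybe"): 0.6,
    (":", "3+4=7"): 0.4,
    ("maybe", "Answer"): 0.1,
    ("3+4=7", "Answer"): 0.9,
}
UNSEEN_PROB = 1e-6  # floor for token pairs missing from the toy table


def continuation_logprob(context, tokens):
    """Log-probability of `tokens` following a non-empty `context`."""
    prev = context[-1]
    total = 0.0
    for tok in tokens:
        total += math.log(BIGRAM_PROBS.get((prev, tok), UNSEEN_PROB))
        prev = tok
    return total


def pick_sequential(prefix, candidates):
    """Choose the variable value by its own likelihood, ignoring what follows."""
    return max(candidates, key=lambda c: continuation_logprob(prefix, c))


def pick_sketch_aware(prefix, candidates, follow_up):
    """Choose the variable value that maximizes the likelihood of the value
    together with the fixed follow-up text of the template."""
    return max(
        candidates,
        key=lambda c: continuation_logprob(prefix, c + follow_up),
    )


# Template (hypothetical): "Think : [REASONING] Answer ..."
prefix = ["Think", ":"]
candidates = [["maybe"], ["3+4=7"]]  # possible values for [REASONING]
follow_up = ["Answer"]               # the next fixed instruction in the template

sequential = pick_sequential(prefix, candidates)
sketch_aware = pick_sketch_aware(prefix, candidates, follow_up)
print("sequential choice:  ", sequential)
print("sketch-aware choice:", sketch_aware)
```

In this toy setting the sequential choice prefers the locally most likely value, while the sketch-aware choice prefers the value after which the follow-up instruction is also likely, which is the sense in which the overall template likelihood is being optimized.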