Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, for generating strings from highly structured languages (e.g., semantic parsing to complex domain-specific languages), it is challenging for the LLM to generalize from just a few exemplars. We propose \emph{grammar prompting}, a simple approach to enable LLMs to use external knowledge and domain-specific constraints, expressed through a grammar in Backus--Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating the particular output example, where the specialized grammar is a subset of the full DSL grammar. For inference, the LLM first predicts a BNF grammar given a test input, and then generates the output according to the rules of the grammar. Experiments demonstrate that grammar prompting can enable LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and SMILES-based molecule generation.
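To make the notion of a specialized grammar concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy GeoQuery-like grammar written as a Python dictionary, a naive backtracking parser that only works for grammars without left recursion, and a whitespace-and-parenthesis tokenizer. All names (`FULL_GRAMMAR`, `derive`, `specialize`, `to_bnf`) are hypothetical. The sketch parses one output program with the full grammar and keeps only the rules that appear in its derivation, which is one simple way to obtain a subset grammar that suffices for that example. The LLM steps (predicting a grammar for a test input, then generating under it) are not shown.

```python
from typing import Iterator

# A grammar maps each nonterminal to its alternatives; each alternative is a
# tuple of symbols. Any symbol that is not a key is treated as a terminal.
Grammar = dict[str, list[tuple[str, ...]]]
Rule = tuple[str, tuple[str, ...]]

# Toy stand-in for a full DSL grammar (an assumption, not the real GeoQuery grammar).
FULL_GRAMMAR: Grammar = {
    "query": [("answer", "(", "set", ")")],
    "set": [("unary", "(", "set", ")"), ("all",)],
    "unary": [("state",), ("city",), ("river",), ("largest",), ("smallest",)],
}


def tokenize(program: str) -> list[str]:
    """Split a program into tokens, treating parentheses as separate tokens."""
    return program.replace("(", " ( ").replace(")", " ) ").split()


def derive(grammar: Grammar, symbol: str, tokens: list[str],
           pos: int) -> Iterator[tuple[int, frozenset[Rule]]]:
    """Yield (next_pos, rules_used) for each way `symbol` matches tokens[pos:]."""
    if symbol not in grammar:  # terminal: must equal the current token
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1, frozenset()
        return
    for rhs in grammar[symbol]:
        yield from expand(grammar, rhs, tokens, pos, frozenset({(symbol, rhs)}))


def expand(grammar: Grammar, rhs: tuple[str, ...], tokens: list[str],
           pos: int, used: frozenset[Rule]) -> Iterator[tuple[int, frozenset[Rule]]]:
    """Match the symbols of `rhs` in sequence, accumulating the rules used."""
    if not rhs:
        yield pos, used
        return
    for mid, used_head in derive(grammar, rhs[0], tokens, pos):
        yield from expand(grammar, rhs[1:], tokens, mid, used | used_head)


def specialize(grammar: Grammar, start: str, program: str) -> Grammar:
    """Return the subset of `grammar` used by the first full parse of `program`."""
    tokens = tokenize(program)
    for end, used in derive(grammar, start, tokens, 0):
        if end == len(tokens):  # accept only parses that consume every token
            return {
                lhs: [rhs for rhs in alts if (lhs, rhs) in used]
                for lhs, alts in grammar.items()
                if any((lhs, rhs) in used for rhs in alts)
            }
    raise ValueError(f"program is not derivable from the grammar: {program!r}")


def to_bnf(grammar: Grammar) -> str:
    """Render a grammar as BNF text, quoting terminals."""
    def show(sym: str) -> str:
        return sym if sym in grammar else f'"{sym}"'
    return "\n".join(
        f"{lhs} ::= " + " | ".join(" ".join(show(s) for s in rhs) for rhs in alts)
        for lhs, alts in grammar.items()
    )


specialized = specialize(FULL_GRAMMAR, "query", "answer(largest(state(all)))")
print(to_bnf(specialized))
```

In a prompt, each demonstration would then pair its input with the BNF text of its specialized grammar and the output program, so that at test time the model can first emit a grammar of the same form and then a program that follows it.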