Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, when the task is to generate strings in a highly structured language (e.g., semantic parsing into a complex domain-specific language), it is challenging for the LLM to generalize from only a few exemplars. We explore $\textbf{grammar prompting}$, a simple approach that lets LLMs use external knowledge and domain-specific constraints, expressed as a grammar in Backus--Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating that example's output; the specialized grammar is a subset of the full DSL grammar. At inference time, the LLM first predicts a BNF grammar for the test input and then generates the output according to the rules of that grammar. Experiments demonstrate that grammar prompting enables LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and molecule generation (SMILES).
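To make the notion of a specialized grammar concrete, the following is a minimal sketch of how one might be derived for a single demonstration output. It is an illustration under stated assumptions, not the authors' implementation: the toy grammar `FULL_GRAMMAR`, the pre-tokenized output, and the helper names `derive`, `specialize`, and `to_bnf` are all hypothetical, and the small backtracking parser only handles grammars without left recursion. The two LLM calls made at inference time (predict a grammar, then generate under it) are not shown.

```python
# Toy DSL grammar (hypothetical, not taken from the paper).
# Each nonterminal maps to a list of alternatives; each alternative is a
# tuple of symbols. Any symbol that is not a key is treated as a terminal.
FULL_GRAMMAR = {
    "query": [("answer", "(", "expr", ")")],
    "expr": [
        ("state", "(", "expr", ")"),
        ("city", "(", "expr", ")"),
        ("largest", "(", "expr", ")"),
        ("all",),
    ],
}


def derive(grammar, symbol, tokens, pos=0):
    """Yield (next_pos, rules_used) for every way `symbol` matches tokens[pos:].

    Plain backtracking; assumes the grammar has no left recursion.
    """
    if symbol not in grammar:  # terminal: must equal the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1, []
        return
    for rhs in grammar[symbol]:
        # Partial matches so far, each with the rules applied along the way.
        states = [(pos, [(symbol, rhs)])]
        for sym in rhs:
            states = [
                (next_pos, rules + more_rules)
                for cur_pos, rules in states
                for next_pos, more_rules in derive(grammar, sym, tokens, cur_pos)
            ]
        yield from states


def specialize(grammar, start, tokens):
    """Return the subset of rules used by the first full derivation of `tokens`."""
    for end, rules in derive(grammar, start, tokens):
        if end == len(tokens):
            used = {}
            for lhs, rhs in rules:
                if rhs not in used.setdefault(lhs, []):
                    used[lhs].append(rhs)
            return used
    raise ValueError("output is not in the language of the grammar")


def to_bnf(rules, nonterminals):
    """Render a rule dictionary as BNF text, quoting terminals."""
    lines = []
    for lhs, alternatives in rules.items():
        rendered = [
            " ".join(s if s in nonterminals else f'"{s}"' for s in rhs)
            for rhs in alternatives
        ]
        lines.append(f"{lhs} ::= " + " | ".join(rendered))
    return "\n".join(lines)


# One demonstration output, already split into tokens (an assumption).
output_tokens = ["answer", "(", "largest", "(", "state", "(", "all", ")", ")", ")"]
specialized = specialize(FULL_GRAMMAR, "query", output_tokens)
bnf = to_bnf(specialized, FULL_GRAMMAR.keys())
print(bnf)
```

For this output the specialized grammar keeps the `largest`, `state`, and `all` alternatives of `expr` and drops the unused `city` alternative. In a prompt, the rendered BNF would sit between the demonstration's input and its output. If the grammar is ambiguous, this sketch simply takes the first derivation it finds, which is a simplification.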