Text-to-SQL aims to automate the generation of SQL queries over a database from natural language text. In this work, we propose "SQLPrompt", which is tailored to improve the few-shot prompting capabilities of Large Language Models (LLMs) for Text-to-SQL. Our methods include an innovative prompt design; an execution-based consistency decoding strategy, which selects the SQL proposal whose execution outcome is most consistent with those of the other proposals; and a method that aims to improve performance by diversifying the SQL proposals during consistency selection with different prompt designs ("MixPrompt") and foundation models ("MixLLMs"). We show that SQLPrompt outperforms previous approaches for in-context learning with few labeled data by a large margin, narrowing the gap with state-of-the-art finetuning methods that use thousands of labeled data.
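To make the execution-based consistency selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes a SQLite database, treats two proposals as agreeing when they return the same multiset of rows, and breaks ties in favor of the earliest proposal. The function names `execute` and `select_by_execution_consistency`, the toy `singer` table, and the hardcoded candidate list (a stand-in for proposals pooled from different prompt designs and models) are all hypothetical.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run one candidate query; return a hashable result, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None  # invalid SQL gets no vote
    # Sort by repr so row order does not matter and mixed types never raise.
    return tuple(sorted(rows, key=repr))


def select_by_execution_consistency(conn, candidates):
    """Return the first candidate whose execution result is the most common one."""
    outcomes = [(sql, execute(conn, sql)) for sql in candidates]
    valid = [(sql, out) for sql, out in outcomes if out is not None]
    if not valid:
        return None
    votes = Counter(out for _, out in valid)
    best_outcome, _ = votes.most_common(1)[0]
    return next(sql for sql, out in valid if out == best_outcome)


# Toy database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ana", 29), ("Ben", 41), ("Cy", 35)],
)

# Hypothetical proposals for "How many singers are there?"
candidates = [
    "SELECT COUNT(*) FROM singer",
    "SELECT COUNT(name) FROM singer",
    "SELECT COUNT(*) FROM singers",  # fails: no such table
    "SELECT MAX(age) FROM singer",   # runs, but disagrees with the majority
]

chosen = select_by_execution_consistency(conn, candidates)
print(chosen)
```

Voting on execution results rather than on SQL strings lets syntactically different but semantically equivalent proposals reinforce each other, which is why a more diverse candidate pool can help the selection step.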