Cross-domain and cross-compositional generalization of Text-to-SQL semantic parsing is a challenging task. Existing Large Language Model (LLM)-based solutions rely on inference-time retrieval of few-shot exemplars from the training set to synthesize a run-time prompt for each Natural Language (NL) test query. In contrast, we devise an algorithm that performs offline sampling of a minimal set of few-shot exemplars from the training data, with complete coverage of SQL clauses, operators, and functions, and maximal domain coverage within the allowed token length. This allows for synthesis of a fixed Generic Prompt (GP), with a diverse set of exemplars common across NL test queries, avoiding expensive test-time exemplar retrieval. We further auto-adapt the GP to the target database domain (DA-GP) to better handle cross-domain generalization, and then apply decomposed Least-To-Most Prompting (LTMP-DA-GP) to handle cross-compositional generalization. The synthesis of LTMP-DA-GP is an offline task, performed once per new database with minimal human intervention. Our approach demonstrates superior performance on the KaggleDBQA dataset, which is designed to evaluate generalizability for the Text-to-SQL task. We further show consistent performance improvement of LTMP-DA-GP over GP across LLMs and databases of KaggleDBQA, highlighting the efficacy and model-agnostic benefits of our prompt-based adapt-and-decompose approach.
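The offline exemplar sampling described above can be illustrated with a greedy set-cover sketch. This is not the authors' algorithm, only one plausible way to pick few-shot exemplars that first cover SQL features and then spread across domains, subject to a token budget. The `Exemplar` type, the `SQL_FEATURES` vocabulary, the whitespace-based `token_cost`, and the function names are all assumptions made for illustration; a greedy pass also does not guarantee a truly minimal set.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately small vocabulary of SQL clauses, operators and functions.
SQL_FEATURES = (
    "SELECT", "WHERE", "GROUP BY", "ORDER BY", "HAVING", "JOIN", "LIMIT",
    "DISTINCT", "LIKE", "IN", "BETWEEN", "COUNT", "AVG", "SUM", "MIN", "MAX",
)


@dataclass(frozen=True)
class Exemplar:
    question: str  # natural-language query
    sql: str       # gold SQL
    domain: str    # database / domain identifier


def sql_features(sql: str) -> set:
    """Return the features from SQL_FEATURES that appear as whole words in the query."""
    text = " ".join(sql.upper().split())  # normalize case and whitespace
    return {f for f in SQL_FEATURES if re.search(r"\b" + re.escape(f) + r"\b", text)}


def token_cost(ex: Exemplar) -> int:
    """Crude token estimate (assumption): whitespace-separated pieces."""
    return len(ex.question.split()) + len(ex.sql.split())


def select_exemplars(pool, max_tokens: int):
    """Greedily pick exemplars: new SQL features first, then new domains, then lower cost."""
    chosen, covered, domains, used = [], set(), set(), 0
    remaining = list(pool)

    def gain(ex):
        new_features = len(sql_features(ex.sql) - covered)
        new_domain = 0 if ex.domain in domains else 1
        return (new_features, new_domain, -token_cost(ex))

    while remaining:
        # Only consider exemplars that still fit in the token budget.
        candidates = [e for e in remaining if used + token_cost(e) <= max_tokens]
        if not candidates:
            break
        best = max(candidates, key=gain)
        new_features, new_domain, _ = gain(best)
        if new_features == 0 and new_domain == 0:
            break  # nothing left adds coverage
        chosen.append(best)
        covered |= sql_features(best.sql)
        domains.add(best.domain)
        used += token_cost(best)
        remaining.remove(best)
    return chosen
```

Because the selection depends only on the training pool and the budget, it can run once offline, and the resulting exemplars can be reused in a fixed prompt for every test query.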