Text-to-SQL semantic parsing has made significant progress in recent years, with various models demonstrating impressive performance on the challenging Spider benchmark. However, it has also been shown that these models often struggle to generalize even when faced with small perturbations of expressions they previously parsed accurately. This is mainly due to the linguistic form of the questions in Spider, which are overly specific, unnatural, and limited in variation. In this work, we use data augmentation to enhance the robustness of text-to-SQL parsers against natural language variations. Existing approaches either generate question reformulations via models trained on Spider or introduce only local changes. In contrast, we leverage the capabilities of large language models to generate more realistic and diverse questions. Using only a few prompts, we achieve a two-fold increase in the number of questions in Spider. Training on this augmented dataset yields substantial improvements on a range of evaluation sets, including robustness benchmarks and out-of-domain data.