Recent advancements in large language models (LLMs) have enabled in-context learning (ICL)-based methods that significantly outperform fine-tuning approaches for text-to-SQL tasks. However, their performance remains considerably lower than that of human experts on benchmarks that include complex schemas and queries, such as BIRD. Motivated by the sensitivity of LLMs to prompts, this study introduces a novel approach that leverages multiple prompts to explore a broader search space of candidate answers and effectively aggregate them. Specifically, we robustly refine the database schema through schema linking using multiple prompts. Thereafter, we generate diverse candidate SQL queries based on the refined schema and varied prompts. Finally, the candidate queries are filtered based on their confidence scores, and the optimal query is obtained through a multiple-choice selection presented to the LLM. When evaluated on the BIRD and Spider benchmarks, the proposed method achieved execution accuracies of 65.5\% and 89.6\%, respectively, significantly outperforming previous ICL-based methods. Moreover, we established a new SOTA on the BIRD benchmark in terms of both the accuracy and efficiency of the generated queries.
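The confidence-based filtering step described above can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's implementation: it treats the vote share of each (normalized) candidate query across prompt variants as a confidence proxy, whereas the actual method may score candidates differently (e.g., by execution-result equivalence rather than string normalization); the LLM generation and multiple-choice selection calls are omitted.

```python
from collections import Counter

def shortlist_candidates(candidates, min_confidence=0.2):
    """Group candidate SQL strings, score each group by vote share
    (a confidence proxy), and keep those above the threshold, best first.
    String normalization here stands in for execution-result equivalence."""
    counts = Counter(q.strip().rstrip(";").lower() for q in candidates)
    total = sum(counts.values())
    scored = [(q, n / total) for q, n in counts.most_common()]
    return [(q, c) for q, c in scored if c >= min_confidence]

# Candidates as might be produced by several prompt variants (toy data).
candidates = [
    "SELECT name FROM users WHERE age > 30",
    "select name from users where age > 30;",
    "SELECT name, age FROM users WHERE age > 30",
    "SELECT name FROM users WHERE age >= 30",
    "SELECT name FROM users WHERE age > 30",
]
shortlist = shortlist_candidates(candidates)
```

In this toy run, three of the five samples normalize to the same query, so it receives confidence 0.6 and heads the shortlist passed to the final multiple-choice selection.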