CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL

To address the challenges of large language model (LLM) performance on Text-to-SQL tasks, we introduce CHASE-SQL, a framework that uses test-time compute in a multi-agent setup to improve candidate generation and selection. CHASE-SQL leverages LLMs' intrinsic knowledge to generate diverse, high-quality SQL candidates using different LLM generators: (1) a divide-and-conquer method that decomposes complex queries into manageable sub-queries in a single LLM call; (2) chain-of-thought reasoning based on query execution plans, reflecting the steps a database engine takes during execution; and (3) an instance-aware synthetic example generation technique, which provides few-shot demonstrations tailored to each test question. To identify the best candidate, a selection agent ranks the candidates through pairwise comparisons using a fine-tuned binary candidate-selection LLM. This selection approach is shown to be more robust than alternatives. The proposed generators-selector framework not only enhances the quality and diversity of SQL queries but also outperforms previous methods. Overall, CHASE-SQL achieves state-of-the-art execution accuracy of 73.0% on the test set and 73.01% on the development set of the BIRD Text-to-SQL benchmark, making it the top submission on the leaderboard at the time of paper submission.
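The pairwise selection step described above can be illustrated with a minimal sketch. It assumes a simple scheme in which every pair of candidates is compared once and the candidate with the most wins is returned; the abstract does not spell out the exact ranking procedure, so this is an assumption. The names `select_by_pairwise_wins` and `toy_comparator` are hypothetical, and the word-overlap heuristic only stands in for the fine-tuned selection LLM so that the example runs on its own.

```python
from itertools import combinations
from typing import Callable, List

# Hypothetical comparator type: returns True when candidate `a` is judged
# better than candidate `b` for the given question. In CHASE-SQL this role
# is played by a fine-tuned selection LLM; here it is only a placeholder.
Comparator = Callable[[str, str, str], bool]


def select_by_pairwise_wins(
    question: str, candidates: List[str], a_beats_b: Comparator
) -> str:
    """Return the candidate that wins the most pairwise comparisons."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    wins = [0] * len(candidates)
    # Assumption: each unordered pair is compared exactly once.
    for i, j in combinations(range(len(candidates)), 2):
        if a_beats_b(question, candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # Ties are broken in favour of the earliest candidate.
    best = max(range(len(candidates)), key=lambda k: wins[k])
    return candidates[best]


def toy_comparator(question: str, a: str, b: str) -> bool:
    """Placeholder heuristic: prefer the SQL sharing more tokens with the question."""
    q_tokens = set(question.lower().split())

    def overlap(sql: str) -> int:
        return len(q_tokens & set(sql.lower().split()))

    return overlap(a) >= overlap(b)


if __name__ == "__main__":
    question = "names of singers older than 30"
    candidates = [
        "SELECT name FROM singers",
        "SELECT name FROM singers WHERE age > 30",
        "SELECT * FROM songs",
    ]
    print(select_by_pairwise_wins(question, candidates, toy_comparator))
```

In a real system the comparator would call the selection model with the question, the database schema, and the two candidate queries. Counting wins over all pairs costs a number of comparisons quadratic in the number of candidates, which stays manageable when only a handful of candidates are generated per question.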


