Text-to-SQL, which enables natural language interaction with databases, is a pivotal technique across diverse industries. With new, more powerful large language models (LLMs) emerging every few months, fine-tuning each of them has become costly, labor-intensive, and error-prone. As an alternative, zero-shot Text-to-SQL, which leverages the growing knowledge and reasoning capabilities encoded in LLMs without task-specific fine-tuning, is a promising but more challenging direction. To address this challenge, we propose Alpha-SQL, a novel approach that uses a Monte Carlo Tree Search (MCTS) framework to iteratively infer SQL construction actions based on partial SQL query states. To enhance the framework's reasoning capabilities, we introduce LLM-as-Action-Model, which dynamically generates SQL construction actions during the MCTS process and steers the search toward more promising SQL queries. Moreover, Alpha-SQL employs a self-supervised reward function to evaluate the quality of candidate SQL queries, making query generation more accurate and efficient. Experimental results show that Alpha-SQL achieves 69.7% execution accuracy on the BIRD development set using a 32B open-source LLM without fine-tuning, outperforming the best previous zero-shot approach based on GPT-4o by 2.5% on the same set.
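To make the search loop described above concrete, here is a minimal sketch of MCTS over partial SQL states. It is an illustration under stated assumptions, not the Alpha-SQL implementation: `propose_actions` is a hypothetical hard-coded stand-in for the LLM action model, `reward` is a toy label-free score (the query executes and returns rows) rather than the paper's self-supervised reward, and the `users` table, the UCT constant, and all function names are invented for the example.

```python
import math
import random
import sqlite3
from dataclasses import dataclass, field


def make_db():
    """Toy in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [("Ann", 41), ("Bob", 25)])
    return conn


def propose_actions(state):
    """Hypothetical stand-in for an LLM action model.

    Maps a partial SQL state (tuple of clauses) to candidate next clauses.
    A real system would prompt an LLM here instead of using a fixed table.
    """
    steps = [
        ["SELECT name FROM users", "SELECT nam FROM users"],  # second has a bad column
        ["WHERE age > 30", "WHERE age > 99"],                 # second matches no rows
    ]
    return steps[len(state)] if len(state) < len(steps) else []


def reward(state, conn):
    """Toy label-free reward (an assumption): 1.0 if the query runs and returns rows."""
    try:
        rows = conn.execute(" ".join(state)).fetchall()
    except sqlite3.Error:
        return 0.0
    return 1.0 if rows else 0.0


@dataclass
class Node:
    state: tuple
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    visits: int = 0
    value: float = 0.0


def uct(child, parent_visits, c=1.4):
    """Upper confidence bound for trees; unvisited children are tried first."""
    if child.visits == 0:
        return math.inf
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def search(conn, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(state=())
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a node without children.
        node = root
        while node.children:
            parent_visits = node.visits
            node = max(node.children, key=lambda ch: uct(ch, parent_visits))
        # Expansion: ask the action model for next clauses, then pick one child.
        actions = propose_actions(node.state)
        if actions:
            node.children = [Node(node.state + (a,), parent=node) for a in actions]
            node = rng.choice(node.children)
        # Simulation: complete the query with random actions and score it.
        state = node.state
        while True:
            rollout_actions = propose_actions(state)
            if not rollout_actions:
                break
            state = state + (rng.choice(rollout_actions),)
        r = reward(state, conn)
        # Backpropagation: update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # Read out the query by following the most-visited child at each level.
    node = root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
    return " ".join(node.state)


if __name__ == "__main__":
    print(search(make_db()))
```

In this toy setup only one of the four complete queries both executes and returns rows, so the visit counts concentrate on that branch. Swapping `propose_actions` for an LLM call and `reward` for a stronger label-free signal would be the natural extension points, though how Alpha-SQL defines either is not specified in the abstract.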