The previous state-of-the-art (SOTA) method achieved remarkable execution accuracy on the Spider dataset, one of the largest and most diverse datasets in the Text-to-SQL domain. However, when we reproduced it on a business dataset, we observed a significant drop in performance. We examined the differences in dataset complexity and in the clarity of the questions' intent, and assessed how these differences affect the performance of prompting methods. We then developed a more adaptable and more general prompting method built mainly on query rewriting and SQL boosting: the former transforms vague information into precise information, and the latter improves the SQL itself by incorporating execution feedback and query results from the database content. To prevent information gaps, we include the comments, value types, and value samples for columns as part of the database description in the prompt. Our experiments with Large Language Models (LLMs) show a significant performance improvement on the business dataset and demonstrate the potential of our method. In terms of execution accuracy on the business dataset, the SOTA method scored 21.05, while our approach scored 65.79. Our approach achieved this improvement even when using a less capable pre-trained language model. Finally, we explore Text-to-Python and Text-to-Function alternatives and analyze their respective pros and cons, offering insights to the community.
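The enriched database description mentioned in the abstract (column comments, value types, and value samples) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the function name `describe_table`, the output format, the `orders` table, and the use of SQLite. Because SQLite does not store column comments, they are passed in as a hypothetical mapping.

```python
import sqlite3


def describe_table(conn, table, comments=None, n_samples=3):
    """Build a prompt-ready description of one table.

    For each column, list its declared type, an optional comment,
    and up to n_samples distinct non-null sample values.
    `comments` is a hypothetical {column_name: comment} mapping.
    """
    comments = comments or {}
    lines = [f"Table {table}:"]
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    for _cid, name, col_type, *_rest in conn.execute(
        f'PRAGMA table_info("{table}")'
    ):
        rows = conn.execute(
            f'SELECT DISTINCT "{name}" FROM "{table}" '
            f'WHERE "{name}" IS NOT NULL LIMIT ?',
            (n_samples,),
        ).fetchall()
        samples = ", ".join(repr(row[0]) for row in rows)
        comment = comments.get(name, "no comment")
        lines.append(
            f"  - {name} ({col_type or 'UNKNOWN'}): {comment}; samples: {samples}"
        )
    return "\n".join(lines)


# Illustrative data only; not taken from the business dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, "paid"), (2, "refunded")]
)
description = describe_table(
    conn, "orders", {"status": "payment state of the order"}
)
print(description)
```

The resulting text could be placed in the prompt ahead of the user question, so the model sees what each column means and what its values look like instead of only the column names.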