Large language models (LLMs) have been adopted for text-to-SQL tasks, using their in-context learning (ICL) capability to translate natural language questions into SQL queries. However, this approach suffers from correctness problems. In this paper, we conduct the first comprehensive study of text-to-SQL errors in ICL-based techniques. Our study covers four representative ICL-based techniques, five basic repair methods, two benchmarks, and two LLM settings. We find that text-to-SQL errors are widespread and summarize 27 error types in 7 categories. We also find that existing repair attempts yield limited correctness improvement while incurring high computational overhead and many mis-repairs. Based on these findings, we propose MapleDoctor, a novel text-to-SQL error detection and repair framework. The evaluation demonstrates that MapleDoctor outperforms existing solutions, repairing 13.8% more queries with a negligible number of mis-repairs and reducing repair latency by 67.4%. The artifact is publicly available on GitHub.