Large Language Models (LLMs) exhibit impressive problem-solving skills across many tasks, but they still underperform humans in various downstream applications, such as text-to-SQL. On the BIRD benchmark leaderboard, human performance reaches an accuracy of 92.96\%, whereas the top-performing method achieves only 72.39\%. Notably, these state-of-the-art (SoTA) methods rely predominantly on in-context learning to simulate human-like reasoning, yet they overlook a critical human skill: continual learning. Inspired by the educational practice of maintaining mistake notebooks during our formative years, we propose LPE-SQL (Leveraging Prior Experience: An Expandable Auxiliary Knowledge Base for Text-to-SQL), a novel framework that augments LLMs with continual learning without requiring parameter fine-tuning. LPE-SQL consists of four modules that \textbf{i)} retrieve relevant entries, \textbf{ii)} generate SQL efficiently, \textbf{iii)} produce the final result through a cross-consistency mechanism, and \textbf{iv)} log successful and failed tasks along with their reasoning processes or reflection-generated tips. Importantly, the core of LPE-SQL is the fourth module; the other modules employ foundational methods, allowing LPE-SQL to be easily integrated with SoTA techniques to further enhance performance. Our experimental results demonstrate that this continual learning approach yields substantial performance gains, with the smaller Llama-3.1-70B model surpassing the larger Llama-3.1-405B model equipped with SoTA methods.
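The retrieve-generate-log cycle described above can be sketched as a minimal loop. This is an illustrative assumption, not the paper's actual implementation: the notebook structure, the word-overlap retrieval, and the stubbed LLM call (`generate_sql`) are all hypothetical stand-ins; a real system would use embedding-based retrieval and an actual model call.

```python
# Minimal sketch of an LPE-SQL-style continual-learning loop (assumed names).
from dataclasses import dataclass, field

@dataclass
class Entry:
    question: str
    sql: str
    note: str  # reasoning trace (on success) or reflection-generated tip (on failure)

@dataclass
class Notebook:
    """Expandable knowledge base: one list for successes, one for mistakes."""
    correct: list = field(default_factory=list)
    mistakes: list = field(default_factory=list)

    def retrieve(self, question: str, k: int = 2):
        # Toy relevance score via shared-word overlap; real systems
        # would use embedding similarity instead.
        def score(e):
            return len(set(question.lower().split())
                       & set(e.question.lower().split()))
        pool = self.correct + self.mistakes
        return sorted(pool, key=score, reverse=True)[:k]

    def log(self, entry: Entry, success: bool):
        # Module iv): record the task in the appropriate notebook.
        (self.correct if success else self.mistakes).append(entry)

def generate_sql(question: str, examples: list) -> str:
    # Stub for the LLM call; the prompt would include retrieved entries
    # so the model can learn from prior successes and mistakes.
    prompt = "\n".join(f"Q: {e.question}\nSQL: {e.sql}\nNote: {e.note}"
                       for e in examples)
    return f"SELECT 1  -- placeholder answer for: {question}"

nb = Notebook()
nb.log(Entry("How many singers are there?",
             "SELECT COUNT(*) FROM singer",
             "COUNT(*) counts rows"), success=True)
hits = nb.retrieve("How many concerts are there?")  # retrieves the similar entry
sql = generate_sql("How many concerts are there?", hits)
```

The point of the sketch is the fourth module: every solved or failed task expands the notebook, so later retrievals draw on a growing store of experience without any parameter updates.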