Natural Language to SQL (NL2SQL) has advanced significantly with large language models (LLMs). However, these models often depend on closed-source systems and substantial computational resources, which poses challenges for data privacy and deployment. Small language models (SLMs), in contrast, struggle with NL2SQL tasks: they perform poorly and are incompatible with existing frameworks. To address these issues, we introduce Feather-SQL, a new lightweight framework tailored for SLMs. Feather-SQL improves SQL executability and accuracy through (1) schema pruning and linking and (2) multi-path and multi-candidate generation. Additionally, we introduce the 1+1 Model Collaboration Paradigm, which pairs a strong general-purpose chat model with a fine-tuned SQL specialist, combining strong analytical reasoning with high-precision SQL generation. Experimental results on BIRD demonstrate that Feather-SQL improves NL2SQL performance on SLMs, with a boost of around 10% for models without fine-tuning. The proposed paradigm raises the accuracy ceiling of SLMs to 54.76%, highlighting its effectiveness.