Text-to-SQL over complex, real-world databases remains brittle even with modern LLMs: iterative refinement often introduces syntactic and semantic drift, corrections tend not to transfer across queries, and naive use of large context windows scales poorly. We propose a controlled text-to-SQL framework built around reflective refinement. Instead of repeatedly rewriting the current SQL instance, the system decomposes generation into typed stages and applies feedback as persistent updates to the stage-level generation mechanism. A Reflection-Refinement Loop localizes violations to the responsible stage in order to maximize preservation of previously validated constraints and to support monotonic improvement over a query set. The method operates without gold SQL by combining interpreter-based checks with LLM-based semantic coverage verification, which together serve as epistemic judges. Experiments on Spider and BIRD demonstrate consistent gains over strong prompting baselines, robust convergence within a small refinement budget, and improved execution accuracy across both frontier and open-weight model families.
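The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stage names, the `localize` heuristic, and `toy_generate` (a stand-in for per-stage LLM calls) are all assumptions made for illustration, and the LLM-based semantic coverage judge is omitted so that only the interpreter-based check is shown, here using SQLite.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Stage:
    """One typed generation stage with persistent guidance (hypothetical)."""
    name: str
    rules: List[str] = field(default_factory=list)  # survives across queries


def interpreter_check(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if SQLite can plan the query, else the error message."""
    try:
        conn.execute("EXPLAIN QUERY PLAN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)


def localize(error: str) -> str:
    """Assumed heuristic: map an interpreter error to the responsible stage."""
    if "no such column" in error or "no such table" in error:
        return "schema_linking"
    return "sql_assembly"


def toy_generate(question: str, stages: Dict[str, Stage]) -> str:
    """Stand-in for LLM calls; each stage would be conditioned on its rules."""
    # Without guidance, the toy linker guesses a column that does not exist.
    column = "name" if stages["schema_linking"].rules else "full_name"
    return f"SELECT {column} FROM singer"


def refine(question: str, conn: sqlite3.Connection, stages: Dict[str, Stage],
           generate: Callable[[str, Dict[str, Stage]], str],
           budget: int = 3) -> Optional[str]:
    """Regenerate until the check passes, updating only the blamed stage."""
    for _ in range(budget):
        sql = generate(question, stages)
        error = interpreter_check(conn, sql)
        if error is None:
            return sql
        # Feedback becomes a persistent stage-level rule, not a one-off patch.
        stages[localize(error)].rules.append(f"Avoid: {error}")
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT)")
stages = {n: Stage(n) for n in ("schema_linking", "sql_assembly")}
first_sql = refine("List all singer names.", conn, stages, toy_generate)
print(first_sql)
```

In this toy run the first attempt fails on a missing column, the failure is attributed to the schema-linking stage, and the stored rule lets later questions pass on the first attempt while the other stage is left untouched.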