Text-to-SQL systems often struggle with deep contextual understanding, particularly for complex queries with subtle requirements. We present PV-SQL, an agentic framework that addresses these failures through two complementary components: Probe and Verify. The Probe component iteratively generates probing queries to retrieve concrete records from the database, resolving ambiguities in value formats, column semantics, and inter-table relationships to build richer contextual understanding. The Verify component employs a rule-based method to extract verifiable conditions and construct an executable checklist, enabling iterative SQL refinement that effectively reduces missing constraints. Experiments on the BIRD benchmarks show that PV-SQL outperforms the best text-to-SQL baseline by 5% in execution accuracy and 20.8% in valid efficiency score while consuming fewer tokens.
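To make the two components concrete, the following is a minimal sketch of one Probe step and one Verify step. It is not the PV-SQL implementation. The function names (`probe_values`, `build_checklist`, `failed_checks`), the three keyword rules, and the `schools` example table are all hypothetical and chosen only for illustration, with SQLite assumed as the database engine.

```python
import re
import sqlite3


def probe_values(conn, table, column, limit=5):
    """Probe (sketch): fetch a few distinct stored values to reveal their format."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for name in (table, column):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"unsafe identifier: {name!r}")
    rows = conn.execute(
        f'SELECT DISTINCT "{column}" FROM "{table}" ORDER BY 1 LIMIT ?', (limit,)
    ).fetchall()
    return [row[0] for row in rows]


def build_checklist(question):
    """Verify (sketch): hypothetical rules mapping question cues to SQL checks."""
    q = question.lower()
    checklist = []
    if re.search(r"\b(how many|number of)\b", q):
        checklist.append(("uses COUNT", lambda sql: "count(" in sql.lower()))
    if re.search(r"\b(distinct|different|unique)\b", q):
        checklist.append(("uses DISTINCT", lambda sql: "distinct" in sql.lower()))
    match = re.search(r"\btop (\d+)\b", q)
    if match:
        n = match.group(1)
        checklist.append((
            f"uses LIMIT {n}",
            lambda sql, n=n: re.search(rf"\blimit\s+{n}\b", sql.lower()) is not None,
        ))
    return checklist


def failed_checks(sql, checklist):
    """Return the names of checklist items that the candidate SQL does not satisfy."""
    return [name for name, check in checklist if not check(sql)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schools (county TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO schools VALUES (?, ?)",
        [("Alameda", "Active"), ("Alameda", "Closed"), ("Fresno", "Active")],
    )
    question = "How many different counties have active schools?"
    # Probing shows how the status values are actually capitalized in the data.
    print(probe_values(conn, "schools", "status"))
    # The checklist flags the constraint that the first draft leaves out.
    draft = "SELECT COUNT(county) FROM schools WHERE status = 'active'"
    print(failed_checks(draft, build_checklist(question)))
```

In an agentic loop, the probed values and the list of failed checks would be fed back to the model as context for the next SQL revision, and refinement would stop once the list is empty.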