Query rewrite transforms SQL queries into semantically equivalent forms that run more efficiently. Existing approaches rely mainly on predefined rewrite rules, which handle only a limited subset of queries and can cause performance regressions. This limitation stems from three challenges of rule-based query rewrite: (1) new rules are hard to discover and verify, (2) fixed rewrite rules do not generalize to new query patterns, and (3) some rewrite techniques cannot be expressed as fixed rules. Human experts rewrite queries significantly better than rules do, but their effort does not scale, while Large Language Models (LLMs) have demonstrated nearly human-level semantic and reasoning abilities. Motivated by these observations, we propose using LLMs to rewrite SQL queries beyond rules. Because LLMs hallucinate, applying them directly often yields nonequivalent or suboptimal queries. To address this issue, we propose QUITE (query rewrite), a training-free and feedback-aware system based on LLM agents that rewrites SQL queries into semantically equivalent forms with significantly better performance, covering a broader range of query patterns and rewrite strategies than rule-based methods. First, we design a multi-agent framework controlled by a finite state machine (FSM) that equips LLMs with the ability to use external tools and enhances the rewrite process with real-time database feedback. Second, we develop a rewrite middleware that improves the ability of LLMs to generate optimized, equivalent queries. Third, we employ a novel hint injection technique to improve the execution plans of rewritten queries. Extensive experiments show that QUITE reduces query execution time by up to 35.8% compared with state-of-the-art approaches and produces 24.1% more rewrites than prior methods, covering query cases that earlier systems did not handle.
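To make the idea of an FSM-controlled rewrite loop with database feedback concrete, the following is a minimal sketch and not QUITE's actual implementation. The state names, the `rewrite` and `rows` helpers, the toy schema, and the fixed candidate list that stands in for an LLM agent are all assumptions made for illustration. The check compares result multisets on one sample database, which is only an empirical test and does not prove semantic equivalence.

```python
import sqlite3
from collections import Counter
from enum import Enum, auto


class State(Enum):
    # Hypothetical states; the source does not list the FSM's actual states.
    PROPOSE = auto()
    VERIFY = auto()
    DONE = auto()


def make_db():
    """Build a small in-memory database to serve as the feedback source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 2, 7.5), (12, 3, 1.0), (13, 1, 2.0);
    """)
    return conn


def rows(conn, sql):
    """Return the result as a multiset of rows, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def rewrite(conn, original, candidates):
    """Walk the FSM: propose a candidate, verify it against the database,
    and fall back to the original query if no candidate passes."""
    expected = rows(conn, original)
    pending = list(candidates)  # stand-in for an LLM agent's proposals
    state, candidate, result = State.PROPOSE, None, original
    while state is not State.DONE:
        if state is State.PROPOSE:
            if pending:
                candidate = pending.pop(0)
                state = State.VERIFY
            else:
                state = State.DONE
        elif state is State.VERIFY:
            got = rows(conn, candidate)
            if got is not None and got == expected:
                result = candidate
                state = State.DONE
            else:
                state = State.PROPOSE  # feedback: reject and ask again
    return result


ORIGINAL = ("SELECT id, amount FROM orders WHERE customer_id IN "
            "(SELECT id FROM customers WHERE region = 'EU')")
# A hallucinated rewrite that drops the region filter.
BAD = ("SELECT o.id, o.amount FROM orders o "
       "JOIN customers c ON o.customer_id = c.id")
# A join-based rewrite that keeps the filter.
GOOD = ("SELECT o.id, o.amount FROM orders o "
        "JOIN customers c ON o.customer_id = c.id WHERE c.region = 'EU'")

conn = make_db()
chosen = rewrite(conn, ORIGINAL, [BAD, GOOD])
```

In this sketch the nonequivalent candidate is rejected because its result differs from the original query's result, and the loop moves on to the next proposal. A real system would also need a performance signal, such as measured execution time or plan cost, before accepting a rewrite.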