Classical query optimization searches over algebraically equivalent plans that differ only in cost. This assumption breaks once LLM-backed operators enter the picture: their placement, ordering, and granularity jointly determine both dollar cost and answer quality, and the right choice among the alternatives is often revealed only at runtime. We formalize this setting as agentic query execution, a query execution paradigm in which agent-based planning is interleaved with execution, and agent workflow optimization becomes the analogue of classical query optimization. We then present EnumGRPO, a self-improving optimizer for this setting. During a learning stage, EnumGRPO enumerates query plans over decisions such as execution paradigm, operator type, operator placement, selectivity scope, and projection width, then distills quality-cost feedback into reusable planning heuristics via in-context reinforcement learning. Across four databases in SWAN, EnumGRPO achieves 35.4% execution accuracy at $0.011 per query in LLM-operator cost, a ~317x cost reduction over the hybrid query baseline with an 18% relative improvement in answer accuracy.
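To make the learning stage more concrete, the following is a minimal sketch of enumerating plans over the five decision dimensions named in the abstract and ranking them by quality-cost feedback. It is an illustration under stated assumptions, not EnumGRPO's actual implementation: the option values in `DECISIONS`, the `Plan` class, the `rank_plans` reward (quality minus weighted cost), and the numbers in `toy_feedback` are all hypothetical, and the distillation of feedback into planning heuristics via in-context reinforcement learning is not modeled here.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical option values for each decision dimension named in the abstract.
DECISIONS = {
    "paradigm": ["static", "agentic"],
    "operator_type": ["llm_map", "llm_filter"],
    "placement": ["before_join", "after_join"],
    "selectivity_scope": ["all_rows", "filtered_rows"],
    "projection_width": ["all_columns", "needed_columns"],
}


@dataclass(frozen=True)
class Plan:
    """One candidate plan: a single choice per decision dimension."""
    paradigm: str
    operator_type: str
    placement: str
    selectivity_scope: str
    projection_width: str


def enumerate_plans():
    """Return every combination of choices (the Cartesian product)."""
    keys = list(DECISIONS)
    return [Plan(**dict(zip(keys, combo))) for combo in product(*DECISIONS.values())]


def rank_plans(plans, feedback, cost_weight=1.0):
    """Sort plans by an assumed scalar reward: quality - cost_weight * cost.

    `feedback` is a callable mapping a plan to (quality, cost_usd).
    """
    def reward(plan):
        quality, cost_usd = feedback(plan)
        return quality - cost_weight * cost_usd

    return sorted(plans, key=reward, reverse=True)


def toy_feedback(plan):
    """Made-up feedback: narrower LLM inputs are cheaper, agentic is slightly better."""
    quality = 0.5 + (0.1 if plan.paradigm == "agentic" else 0.0)
    cost_usd = 1.0
    if plan.placement == "after_join":
        cost_usd *= 0.5
    if plan.selectivity_scope == "filtered_rows":
        cost_usd *= 0.5
    if plan.projection_width == "needed_columns":
        cost_usd *= 0.5
    return quality, cost_usd


if __name__ == "__main__":
    plans = enumerate_plans()
    best = rank_plans(plans, toy_feedback)[0]
    print(len(plans), best)
```

In a real system the feedback would come from executing each plan and measuring answer quality and LLM-operator cost. The toy function only stands in for that measurement so the enumeration and ranking steps can be run end to end.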