The query optimizer is a fundamental component of database management systems. Recent studies have shown that learned query optimizers outperform traditional cost-based query optimizers. However, these learned optimizers do not exploit the runtime observations generated during query execution to dynamically re-optimize the plan, which limits further improvements in query performance. To address this issue, we propose learned query re-optimization, which allows optimization decisions to be deferred to execution time and guided by actual runtime observations. We realize this idea in LQRS, a learned query re-optimization framework built on Spark SQL that uses runtime observations to refine plans dynamically. Specifically, LQRS employs a curriculum reinforcement learning strategy and jointly supports pre-execution and in-execution optimization, so that knowledge learned during execution directly benefits pre-execution planning. Furthermore, we design a plug-and-play planner extension on top of the extensibility interfaces of Spark SQL, enabling online plan modification. Experiments on Spark SQL demonstrate that LQRS reduces end-to-end execution time by up to 90% compared to other learned query optimizers and query re-optimization methods.
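To make the idea of deferring a decision to execution time concrete, the following is a minimal toy sketch, not the LQRS implementation. It assumes a single join whose build side has an estimated row count before execution and an observed row count once its stage has run. A fixed threshold rule stands in for the learned policy, and all names (`JoinInput`, `choose_strategy`, `run_with_reoptimization`, `BROADCAST_THRESHOLD_ROWS`) are hypothetical and do not come from Spark SQL or LQRS.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical cutoff chosen for illustration; not a Spark default.
BROADCAST_THRESHOLD_ROWS = 1_000


@dataclass
class JoinInput:
    name: str
    estimated_rows: int                  # pre-execution cardinality estimate
    observed_rows: Optional[int] = None  # filled in once the stage has run


def choose_strategy(build_side: JoinInput) -> str:
    """Pick a join strategy, preferring the runtime observation if present."""
    rows = (
        build_side.observed_rows
        if build_side.observed_rows is not None
        else build_side.estimated_rows
    )
    if rows <= BROADCAST_THRESHOLD_ROWS:
        return "broadcast_hash_join"
    return "sort_merge_join"


def run_with_reoptimization(
    build_side: JoinInput,
    execute_stage: Callable[[JoinInput], int],
) -> Tuple[str, str]:
    """Return (pre-execution choice, in-execution choice) for one join."""
    planned = choose_strategy(build_side)                 # uses the estimate only
    build_side.observed_rows = execute_stage(build_side)  # runtime observation
    revised = choose_strategy(build_side)                 # uses the observation
    return planned, revised


if __name__ == "__main__":
    # The estimate is far too high; the executed stage yields only 400 rows.
    dim = JoinInput("filtered_dim", estimated_rows=50_000)
    planned, revised = run_with_reoptimization(dim, lambda _: 400)
    print(planned, "->", revised)  # sort_merge_join -> broadcast_hash_join
```

In this sketch the plan chosen from the estimate alone is revised once the true cardinality is known. A learned re-optimizer would replace the threshold rule in `choose_strategy` with a trained policy that takes such observations as input.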