Most existing parametric query optimization (PQO) techniques rely on traditional query optimizer cost models, which are often inaccurate and result in suboptimal query performance. We propose Kepler, an end-to-end learning-based approach to PQO that demonstrates significant speedups in query latency over a traditional query optimizer. Central to our method is Row Count Evolution (RCE), a novel plan generation algorithm based on perturbations in the sub-plan cardinality space. While previous approaches require accurate cost models, we bypass this requirement by evaluating candidate plans via actual execution data and training an ML model to predict the fastest plan given parameter binding values. Our models leverage recent advances in neural network uncertainty in order to robustly predict faster plans while avoiding regressions in query performance. Experimentally, we show that Kepler achieves significant improvements in query runtime on multiple datasets on PostgreSQL.
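The two steps described above, labeling each parameter binding with its fastest executed plan and predicting a plan at query time without risking regressions, can be illustrated with a minimal sketch. This is not Kepler's implementation: the function names, the plan identifiers, and the fixed confidence threshold are hypothetical, and the abstract does not specify how model uncertainty is turned into a decision. The sketch assumes the model emits one confidence score per candidate plan and that the system falls back to the optimizer's default plan when no score is high enough.

```python
from typing import Dict, Sequence


def fastest_plan(latencies_ms: Dict[str, float]) -> str:
    """Return the candidate plan with the lowest measured latency.

    Hypothetical labeling step: for one parameter binding, the training
    label is the plan that actually ran fastest, not the plan a cost
    model estimated to be cheapest.
    """
    if not latencies_ms:
        raise ValueError("need at least one executed plan")
    return min(latencies_ms, key=latencies_ms.get)


def choose_plan(
    plan_ids: Sequence[str],
    confidences: Sequence[float],
    default_plan: str,
    min_confidence: float = 0.9,  # assumed threshold, not from the paper
) -> str:
    """Pick the model's top plan only when its confidence is high enough.

    Otherwise fall back to the optimizer's default plan, so an uncertain
    prediction cannot make the query slower than the baseline.
    """
    if not plan_ids or len(plan_ids) != len(confidences):
        raise ValueError("plan_ids and confidences must be non-empty and aligned")
    best = max(range(len(plan_ids)), key=lambda i: confidences[i])
    if confidences[best] >= min_confidence:
        return plan_ids[best]
    return default_plan


if __name__ == "__main__":
    # Measured latencies for one parameter binding (made-up numbers).
    print(fastest_plan({"p0": 120.0, "p1": 45.5, "p2": 300.0}))
    # An uncertain prediction falls back to the default plan "p0".
    print(choose_plan(["p0", "p1"], [0.3, 0.7], default_plan="p0"))
    # A confident prediction overrides the default.
    print(choose_plan(["p0", "p1"], [0.05, 0.95], default_plan="p0"))
```

The fallback branch is what trades some potential speedup for protection against regressions: raising `min_confidence` makes the system behave more like the default optimizer, and lowering it accepts more predicted plans.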