Most existing parametric query optimization (PQO) techniques rely on traditional query optimizer cost models, which are often inaccurate and result in suboptimal query performance. We propose Kepler, an end-to-end learning-based approach to PQO that demonstrates significant speedups in query latency over a traditional query optimizer. Central to our method is Row Count Evolution (RCE), a novel plan generation algorithm based on perturbations in the sub-plan cardinality space. Whereas previous approaches require accurate cost models, we bypass this requirement by evaluating candidate plans on actual execution data and training an ML model to predict the fastest plan given the parameter binding values. Our models leverage recent advances in neural network uncertainty to robustly predict faster plans while avoiding regressions in query performance. Experimentally, we show that Kepler achieves significant improvements in query runtime across multiple datasets on PostgreSQL.
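The two ideas in the abstract, generating candidate plans by perturbing sub-plan row counts and only trusting a learned prediction when it is confident, can be illustrated with a minimal sketch. This is not the Kepler implementation: the function names, the log-uniform perturbation, the generation loop, the `min_confidence` threshold, and the `toy_planner` stand-in for a real optimizer are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a cardinality assignment maps a sub-plan name to a row
# count, and a plan is represented only by an opaque string identifier.
Cardinalities = Dict[str, float]
Planner = Callable[[Cardinalities], str]


def generate_candidate_plans(
    base: Cardinalities,
    planner: Planner,
    generations: int = 3,
    samples_per_generation: int = 20,
    max_log10_shift: float = 2.0,
    seed: int = 0,
) -> List[str]:
    """Collect distinct plans by repeatedly perturbing sub-plan row counts.

    Assumption: each row count is scaled by a random factor drawn
    log-uniformly from [10^-shift, 10^+shift]; the source does not specify
    the perturbation scheme.
    """
    rng = random.Random(seed)
    plans = [planner(base)]
    frontier = [base]
    for _ in range(generations):
        next_frontier = []
        for cards in frontier:
            for _ in range(samples_per_generation):
                perturbed = {
                    name: max(1.0, rows * 10 ** rng.uniform(-max_log10_shift, max_log10_shift))
                    for name, rows in cards.items()
                }
                plan = planner(perturbed)
                if plan not in plans:
                    # Only cardinalities that produced a new plan seed the
                    # next generation of perturbations.
                    plans.append(plan)
                    next_frontier.append(perturbed)
        if not next_frontier:
            break
        frontier = next_frontier
    return plans


def choose_plan(
    probabilities: Sequence[float],
    plans: Sequence[str],
    default_plan: str,
    min_confidence: float = 0.8,
) -> str:
    """Return the predicted-fastest plan, or the default plan if the
    prediction's confidence is below the (assumed) threshold."""
    best = max(range(len(plans)), key=lambda i: probabilities[i])
    return plans[best] if probabilities[best] >= min_confidence else default_plan


def toy_planner(cards: Cardinalities) -> str:
    """Hypothetical stand-in for a real optimizer: picks a join strategy
    from a single estimated row count."""
    return "hash_join" if cards["orders_x_items"] > 1000 else "nested_loop"


if __name__ == "__main__":
    candidates = generate_candidate_plans({"orders_x_items": 500.0}, toy_planner)
    print(candidates)
    # The probabilities would come from a trained model; here they are made up.
    print(choose_plan([0.1, 0.9], candidates, default_plan=candidates[0]))
```

In a real setting the planner callback would invoke the database optimizer with injected cardinality estimates, and the probabilities would come from a model trained on measured execution latencies. Falling back to the default plan under low confidence is what guards against regressions in this sketch.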