Traditional query optimizers are designed to be fast and stateless: each query is quickly optimized using approximate statistics, sent off to the execution engine, and promptly forgotten. Recent work on learned query optimization has shown that it is possible for a query optimizer to "learn from its mistakes," correcting an erroneous query plan the next time a plan is produced for that query. But what if query optimizers could avoid mistakes entirely? This paper presents the idea of learned query superoptimization. A new generation of query superoptimizers could autonomously experiment to discover optimal plans using exploration-driven algorithms, iterative Bayesian optimization, and program synthesis. While such superoptimizers will take significantly longer to optimize a given query, they have the potential to massively accelerate the large number of important repetitive queries being executed on data systems today.