Traditional query optimizers are designed to be fast and stateless: each query is quickly optimized using approximate statistics, sent off to the execution engine, and promptly forgotten. Recent work on learned query optimization has shown that it is possible for a query optimizer to "learn from its mistakes," correcting an erroneous query plan the next time a plan is produced for that query. But what if query optimizers could avoid mistakes entirely? This paper presents the idea of learned query superoptimization. A new generation of query superoptimizers could autonomously experiment to discover optimal plans using exploration-driven algorithms, iterative Bayesian optimization, and program synthesis. While such superoptimizers will take significantly longer to optimize a given query, they have the potential to massively accelerate the large number of important repetitive queries being executed on data systems today.