Despite the many recent efforts to apply deep reinforcement learning to query optimization, there remains room for improvement, as query optimizers are complex entities that require hand-designed tuning for workloads and datasets. Recent research mostly presents learned query optimization results on single workloads in bulk, with a focus on picking up the unique traits of the specific workload. This proves problematic in scenarios where the different characteristics of multiple workloads and datasets must be mixed and learned together. Hence, in this paper, we propose BitE, a novel ensemble learning model that uses database statistics and metadata to tune a learned query optimizer for better performance. Along the way, we introduce multiple revisions to address several challenges: we extend the search space for the optimal Abstract SQL Plan (ASP, represented as a JSON object) by expanding hintsets; by configuring the experience with all unique plans of queries, we steer the model away from default plans that may be biased; and we deviate from traditional loss functions, choosing an alternative method to cope with underestimation and overestimation of reward. Our model achieves 19.6% more improved queries and 15.8% fewer regressed queries than existing traditional methods while using a comparable level of resources.