Despite many recent efforts to apply deep reinforcement learning to query optimization, there remains room for improvement, because query optimizers are complex entities that require hand-designed tuning for workloads and datasets. Recent research mostly reports learned query optimization results on single workloads in bulk, focusing on picking up the unique traits of the specific workload. This proves problematic in scenarios where the differing characteristics of multiple workloads and datasets are to be mixed and learned together. Hence, in this paper, we propose BitE, a novel ensemble learning model that uses database statistics and metadata to tune a learned query optimizer for enhanced performance. Along the way, we introduce multiple revisions to solve several challenges: we extend the search space for the optimal Abstract SQL Plan (ASP, represented as a JSON object) by expanding hintsets; we steer the model away from default plans that may be biased by configuring the experience with all unique plans of queries; and we deviate from traditional loss functions, choosing an alternative method to cope with underestimation and overestimation of reward. Our model achieves 19.6% more improved queries and 15.8% fewer regressed queries compared to existing traditional methods while using a comparable level of resources.