UTune: Towards Uncertainty-Aware Online Index Tuning

There has been a flurry of recent proposals on learned benefit estimators for index tuning. Although these learned estimators estimate index benefit more accurately than what-if query optimizer calls, they face significant limitations when applied to online index tuning, an arguably more common and more challenging scenario in real-world applications. Learned index benefit estimators face two major challenges in online tuning: (1) the limited amount of query execution feedback available for training the models, and (2) the continual arrival of new, unseen queries due to workload drift. Together, these two challenges hinder the generalization capability of existing learned index benefit estimators. To overcome them, we present UTune, an uncertainty-aware online index tuning framework that employs operator-level learned models with improved generalization to unseen queries. At the core of UTune is an uncertainty quantification mechanism that characterizes the inherent uncertainty of the operator-level learned models given limited online execution feedback. We further integrate uncertainty information into index selection and configuration enumeration, the key components of any index tuner, by developing a new variant of the classic $\epsilon$-greedy search strategy that uses uncertainty-weighted index benefits. Experimental evaluation shows that UTune not only significantly improves workload execution time compared to state-of-the-art online index tuners but also reduces index exploration overhead, resulting in faster convergence when the workload is relatively stable.
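To make the idea of an $\epsilon$-greedy search over uncertainty-weighted index benefits more concrete, the following is a minimal illustrative sketch, not UTune's actual algorithm. The `Candidate` type, the discount `benefit / (1 + uncertainty)`, and the uncertainty-proportional exploration rule are all assumptions introduced here for illustration; the abstract does not specify how the weighting is defined.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical index identifier
    benefit: float      # estimated benefit from a learned model
    uncertainty: float  # non-negative uncertainty of that estimate


def weighted_benefit(c: Candidate) -> float:
    # Assumed weighting: discount the estimate as its uncertainty grows.
    return c.benefit / (1.0 + c.uncertainty)


def select_index(candidates, epsilon, rng):
    """Pick one candidate with an epsilon-greedy rule (illustrative only)."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    if rng.random() < epsilon:
        # Explore: sample in proportion to uncertainty, so poorly
        # understood candidates are tried more often (an assumption).
        weights = [c.uncertainty + 1e-9 for c in candidates]
        return rng.choices(candidates, weights=weights, k=1)[0]
    # Exploit: take the best uncertainty-weighted benefit.
    return max(candidates, key=weighted_benefit)
```

Under this assumed weighting, a candidate with a large but very uncertain estimated benefit can rank below one with a smaller but well-supported estimate during exploitation, while exploration steps are directed at the candidates the model knows least about.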
