UTune: Towards Uncertainty-Aware Online Index Tuning

There has been a flurry of recent proposals on learned benefit estimators for index tuning. Although these learned estimators show promising improvements over what-if query optimizer calls in the accuracy of estimated index benefit, they face significant limitations when applied to online index tuning, an arguably more common and more challenging scenario in real-world applications. Learned index benefit estimators face two major challenges in online tuning: (1) only a limited amount of query execution feedback is available to train the models, and (2) new, unseen queries arrive constantly due to workload drift. Together, these two challenges hinder the generalization capability of existing learned index benefit estimators. To overcome them, we present UTune, an uncertainty-aware online index tuning framework that employs operator-level learned models with improved generalization to unseen queries. At the core of UTune is an uncertainty quantification mechanism that characterizes the inherent uncertainty of the operator-level learned models given limited online execution feedback. We further integrate uncertainty information into index selection and configuration enumeration, the key components of any index tuner, by developing a new variant of the classic $ε$-greedy search strategy with uncertainty-weighted index benefits. Experimental evaluation shows that UTune not only significantly reduces workload execution time compared to state-of-the-art online index tuners but also reduces index exploration overhead, resulting in faster convergence when the workload is relatively stable.
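To make the idea of an $ε$-greedy search over uncertainty-weighted benefits concrete, the following is a minimal illustrative sketch, not UTune's actual algorithm. The abstract does not give the weighting formula, so everything here is an assumption: the names `weighted_benefit` and `select_indexes`, an uncertainty score normalized to [0, 1], a discount of the form `benefit * (1 - uncertainty)` when exploiting, and exploration that samples candidates in proportion to their uncertainty.

```python
import random


def weighted_benefit(benefit, uncertainty):
    """Discount an estimated benefit by its uncertainty in [0, 1].

    Assumed weighting for illustration; the abstract does not specify one.
    """
    return benefit * (1.0 - uncertainty)


def select_indexes(estimates, k, epsilon, rng):
    """Pick up to k candidate indexes with an epsilon-greedy rule.

    estimates maps a (hypothetical) index name to (benefit, uncertainty).
    """
    remaining = dict(estimates)
    chosen = []
    while remaining and len(chosen) < k:
        names = sorted(remaining)
        if rng.random() < epsilon:
            # Explore: favor candidates whose estimates are least trusted.
            weights = [remaining[n][1] for n in names]
            if sum(weights) > 0:
                pick = rng.choices(names, weights=weights, k=1)[0]
            else:
                pick = rng.choice(names)
        else:
            # Exploit: take the best uncertainty-weighted benefit.
            pick = max(names, key=lambda n: weighted_benefit(*remaining[n]))
        chosen.append(pick)
        del remaining[pick]
    return chosen


# Hypothetical candidates: (estimated benefit, uncertainty).
estimates = {
    "idx_a": (10.0, 0.9),  # large benefit, but a poorly trusted estimate
    "idx_b": (6.0, 0.1),
    "idx_c": (4.0, 0.0),
}
print(select_indexes(estimates, k=2, epsilon=0.0, rng=random.Random(0)))
# -> ['idx_b', 'idx_c']
```

With `epsilon=0.0` the search is purely greedy, and `idx_a` is passed over despite having the largest raw benefit because its estimate is heavily discounted. Raising `epsilon` shifts selections toward uncertain candidates such as `idx_a`, which is how a tuner of this kind could gather execution feedback where its model is weakest.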
