Indexes can significantly improve search performance in relational databases. However, if the query workload changes frequently or data is updated continuously, it may not be worthwhile to build a conventional index upfront for query processing. Adaptive indexing is a technique in which an index is built on the fly as a byproduct of query processing. In recent years, research in database indexing has taken a new direction in which machine learning models are employed for indexing. These indexes, known as learned indexes, can be more efficient than traditional indexes such as the B+-tree in terms of memory footprint and query performance. However, a learned index has to be constructed upfront and requires training the model in advance, which becomes a challenge in dynamic settings where the workload changes frequently. To the best of our knowledge, no learned indexes exist yet for adaptive indexing. We propose a novel learned approach for adaptive indexing. It is built on the fly as queries are submitted and utilizes learned models for indexing data. To enhance query performance, we employ a query workload prediction technique that projects the future workload based on past workload data. We have evaluated our learned adaptive indexing approach against existing adaptive indexes for various query workloads. Our results show that our approach performs better than the others in most cases, offering a 1.2x to 5.6x improvement in query performance.
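To make the notion of a learned index concrete, the following is a minimal sketch of the general idea only, not the approach proposed here: a single linear model predicts a key's position in a sorted array, and a recorded worst-case error bounds the final search window. The class name `LinearLearnedIndex` and its methods are hypothetical and chosen purely for illustration. The sketch also builds the model upfront, which is exactly the limitation that an adaptive variant would need to remove.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index (illustrative assumption, not the paper's method).

    A linear model maps a key to an approximate position in a sorted array;
    a bounded binary search then corrects the prediction.
    """

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)

        # Fit position ~= slope * key + intercept by ordinary least squares.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var_key = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos)
                  for i, k in enumerate(self.keys))
        self.slope = cov / var_key if var_key else 0.0
        self.intercept = mean_pos - self.slope * mean_key

        # Worst-case prediction error over the indexed keys; it bounds
        # how far a lookup has to search around the predicted position.
        self.max_err = max(abs(i - self._predict(k))
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        # Clamp the rounded prediction to a valid array position.
        pos = round(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of the first occurrence of key, or None."""
        pos = self._predict(key)
        lo = max(pos - self.max_err, 0)
        hi = min(pos + self.max_err + 1, len(self.keys))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([10, 20, 30, 40, 55, 70])
print(index.lookup(55))  # position of key 55 in the sorted array
print(index.lookup(35))  # None, since 35 is not indexed
```

The model here replaces the inner nodes of a tree with two floating-point parameters, which is where the potential memory saving comes from; the price is that the model and its error bound are tied to the data seen at construction time.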