Recent work on database query optimization has relied on complex machine learning strategies, such as customized reinforcement learning schemes. Surprisingly, we show that LLM embeddings of query text contain semantic information that is useful for query optimization. Specifically, we show that a simple binary classifier that decides between alternative query plans, trained only on a small number of labeled embedded query vectors, can outperform existing heuristic systems. Although we present only preliminary results, an LLM-powered query optimizer could provide significant benefits in both performance and simplicity.
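As a rough illustration of the kind of classifier described above, the following Python sketch trains a logistic regression on embedded query vectors and uses it to choose between two plans. It is a sketch under stated assumptions, not the method evaluated here: `embed` is a hypothetical stand-in (a hashed bag of tokens) for a real LLM embedding call, `DIM`, `train`, and `choose_plan` are illustrative names, and the labeled queries are made up, with label 1 meaning the alternative plan ran faster than the default.

```python
import hashlib
import math
from typing import List, Sequence, Tuple

DIM = 64  # assumed toy size; real LLM embeddings are much larger


def embed(query: str) -> List[float]:
    """Hypothetical stand-in for an LLM embedding: hashed bag of tokens, L2-normalized."""
    vec = [0.0] * DIM
    for token in query.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(vec: Sequence[float], w: Sequence[float], b: float) -> float:
    """Linear score w . vec + b."""
    return sum(wj * xj for wj, xj in zip(w, vec)) + b


def train(X: Sequence[Sequence[float]], y: Sequence[int],
          epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(score(xi, w, b)) - yi  # gradient of log loss w.r.t. the score
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def choose_plan(vec: Sequence[float], w: Sequence[float], b: float) -> int:
    """Return 1 to pick the alternative plan, 0 to keep the default plan."""
    return 1 if sigmoid(score(vec, w, b)) >= 0.5 else 0


if __name__ == "__main__":
    # Made-up training set: (query text, 1 if the alternative plan was faster).
    labeled = [
        ("SELECT * FROM orders o JOIN items i ON o.id = i.order_id", 1),
        ("SELECT o.id FROM orders o JOIN customers c ON o.cust = c.id", 1),
        ("SELECT name FROM users WHERE id = 42", 0),
        ("SELECT count(*) FROM events WHERE day = '2024-01-01'", 0),
    ]
    X = [embed(q) for q, _ in labeled]
    y = [label for _, label in labeled]
    w, b = train(X, y)
    new_query = "SELECT * FROM orders o JOIN payments p ON o.id = p.order_id"
    print("chosen plan:", choose_plan(embed(new_query), w, b))
```

In a real setting, `embed` would be replaced by a call to an embedding model, and the labels would come from timing both plans on a sample of queries. The classifier itself can stay this simple.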