With the advent of Large Language Models (LLMs), many database systems have introduced semantic operators that enable analytical queries over unstructured data (e.g., text, images, videos). Semantic operators typically incur high inference costs and latencies, making semantic (AI) SQL queries challenging to apply to large-scale datasets. At the same time, their semantic nature leads database engines to treat them as black boxes, making AI SQL queries difficult to optimize. In this paper, we introduce Larch, a framework for optimizing the execution of semantic filters in AI SQL queries. Larch is inspired by two key observations: (i) the high latency of semantic operators leaves significant room for computationally heavy runtime optimization techniques, and (ii) unstructured data are typically accompanied by semantic information in the form of embeddings, allowing for efficient semantic comparisons between AI_FILTER prompts and data values. Based on these two observations, we present two Larch variants: Larch-A2C and Larch-Sel. Larch-A2C encodes arbitrary semantic filter expression trees using an embedding-augmented Gated Graph Neural Network and formulates the choice of filter evaluation order as a Markov decision process. In contrast, Larch-Sel uses a supervised learning model to predict filter selectivities and then applies dynamic programming to find a near-optimal evaluation order for each input row. Evaluated across diverse real-world datasets and comprehensive synthetic workloads, both Larch variants always outperform existing semantic filter optimization techniques in terms of token usage. Our results demonstrate that Larch is robust across diverse workloads, reducing total token cost overhead by 3x-19x compared to Palimpzest and Quest.
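To make the idea of ordering filters by predicted selectivity concrete, the following is a minimal sketch and not Larch's actual algorithm. It assumes a flat conjunction (AND) of statistically independent filters, each with a known token cost and a predicted per-row pass probability, and uses a subset dynamic program to find the order with the lowest expected token cost. The function name `best_order` and its parameters are hypothetical.

```python
from functools import lru_cache


def best_order(costs, pass_probs):
    """Return (expected_cost, order) for an AND of independent filters.

    Assumptions (illustrative only): costs[i] is the token cost of filter i,
    pass_probs[i] is its predicted probability of passing for this row, and
    evaluation stops at the first filter that rejects the row.
    """
    n = len(costs)

    @lru_cache(maxsize=None)
    def solve(remaining):
        # `remaining` is a bitmask of filters that have not been evaluated yet.
        if remaining == 0:
            return 0.0, ()
        best = None
        for i in range(n):
            if remaining & (1 << i):
                tail_cost, tail_order = solve(remaining & ~(1 << i))
                # Filter i is always paid for; the rest only if i passes.
                total = costs[i] + pass_probs[i] * tail_cost
                if best is None or total < best[0]:
                    best = (total, (i,) + tail_order)
        return best

    cost, order = solve((1 << n) - 1)
    return cost, list(order)
```

For example, with costs `[100, 100, 20]` and pass probabilities `[0.9, 0.1, 0.5]`, the sketch evaluates the cheap filter first and the least selective filter last. The subset enumeration is exponential in the number of filters, so it is only practical for the handful of filters typical of a single query predicate.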