Larch: Learned Query Optimization for Semantic Predicates

With the advent of Large Language Models (LLMs), many database systems have introduced semantic operators that enable analytical queries over unstructured data (e.g., text, images, videos). Semantic operators typically incur high inference costs and latencies, making semantic (AI) SQL queries challenging to apply to large-scale datasets. At the same time, their semantic nature leads database engines to treat them as black boxes, making AI SQL queries difficult to optimize. In this paper, we introduce Larch, a framework for optimizing the execution of semantic filters in AI SQL queries. Larch is motivated by two key observations: (i) the high latency of semantic operators leaves significant room for computationally heavy runtime optimization techniques; (ii) unstructured data are typically accompanied by semantic information in the form of embeddings, allowing for efficient semantic comparisons between AI_FILTER prompts and data values. Based on these two observations, we present two Larch variants: Larch-A2C and Larch-Sel. Larch-A2C encodes arbitrary semantic filter expression trees using an embedding-augmented Gated Graph Neural Network and formulates the filter evaluation order as a Markov decision process. In contrast, Larch-Sel leverages a supervised learning model to predict filter selectivities and then applies dynamic programming to find a near-optimal evaluation order for each input row. Evaluated across diverse real-world datasets and comprehensive synthetic workloads, both Larch variants always outperform existing semantic filter optimization techniques in terms of token usage. Our results demonstrate that Larch is robust across diverse workloads, reducing total token cost overhead by 3x-19x compared to Palimpzest and Quest.
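The idea behind Larch-Sel, ordering filters by predicted selectivity, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a pure conjunction (AND) of filters whose outcomes are independent, and it treats the per-filter token costs and pass probabilities as already given, whereas Larch-Sel would predict the selectivities with a learned model and handle arbitrary expression trees. The function name `best_and_order` and its parameters are hypothetical.

```python
from functools import lru_cache


def best_and_order(costs, pass_probs):
    """Return (expected_cost, order) for evaluating an AND of filters.

    Assumptions (illustrative only, not taken from the paper):
    - costs[i] is the token cost of running filter i on one row;
    - pass_probs[i] is the predicted probability that filter i passes;
    - filter outcomes are independent;
    - evaluation stops as soon as one filter fails (short-circuit AND).
    """
    n = len(costs)
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def solve(mask):
        # mask holds the filters already evaluated, all of which passed.
        if mask == full:
            return 0.0, ()
        best = None
        for i in range(n):
            if mask & (1 << i):
                continue
            rest_cost, rest_order = solve(mask | (1 << i))
            # Pay for filter i now; the remaining filters are only
            # evaluated if filter i passes.
            total = costs[i] + pass_probs[i] * rest_cost
            if best is None or total < best[0]:
                best = (total, (i,) + rest_order)
        return best

    return solve(0)


if __name__ == "__main__":
    cost, order = best_and_order([100, 10, 50], [0.5, 0.9, 0.2])
    print(order, cost)
```

With these made-up inputs, the cheapest plan runs the highly selective third filter first, for an expected cost of about 70 tokens per row, compared with 145 tokens if all three filters were always evaluated. The subset dynamic program is exponential in the number of filters, which is tolerable here only because, as observation (i) notes, each LLM call is far more expensive than the planning step.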

