As multi-modal data becomes more widely used, semantic querying is increasingly in demand in data management systems as an important way to access and analyze such data. Because multi-modal data (text, image, video, etc.) is unstructured, most of its information lies in its semantics, which traditional database queries such as SQL cannot access. Given the ability of Large Language Models (LLMs) to understand semantics and process natural language, several LLM-based semantic query systems have been proposed in recent years to support semantic querying over unstructured data. However, this rapid growth has produced a fragmented ecosystem. Applications face significant integration challenges due to (1) the disparate APIs of different semantic query systems and (2) a fundamental trade-off between specialization and generality. Many semantic query systems are highly specialized, offering state-of-the-art performance within a single modality but struggling with multi-modal data. Conversely, some "all-in-one" systems handle multiple modalities but often perform worse than their specialized counterparts on specific modalities. This paper introduces Meta Engine, a novel ``query system on query systems'' designed to resolve these challenges. Meta Engine is a unified semantic query engine that integrates heterogeneous, specialized LLM-based query systems. Its architecture comprises five key components: (1) a Natural Language (NL) Query Parser, (2) an Operator Generator, (3) a Query Router, (4) a set of Adapters, and (5) a Result Aggregator. In the evaluation, Meta Engine consistently outperforms all baselines, yielding 3--6x higher F1 in most cases and up to ~24x on specific datasets.
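To make the five-component architecture concrete, the following is a minimal sketch of how such a pipeline might be wired together. It is not the paper's implementation: every name here (`Operator`, `StubAdapter`, `parse_query`, `generate_operators`, `route`, `aggregate`, `run`) is hypothetical, the parser is a plain string-splitting placeholder where an LLM would presumably be used, the adapters match by substring instead of calling real query systems, and aggregating by order-preserving union is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Operator:
    """One unit of work: a semantic predicate applied to one modality."""
    modality: str   # e.g. "text" or "image"
    predicate: str  # condition the backend should evaluate


class Adapter(Protocol):
    """(4) Uniform interface that hides each backend system's own API."""
    def execute(self, op: Operator) -> List[str]: ...


class StubAdapter:
    """Stand-in for a specialized backend; matches by substring only."""
    def __init__(self, items: Dict[str, str]) -> None:
        self.items = items  # item id -> textual description

    def execute(self, op: Operator) -> List[str]:
        return [i for i, desc in self.items.items() if op.predicate in desc]


def parse_query(nl_query: str) -> Dict[str, str]:
    """(1) Parser placeholder: expects '<modality>: <predicate>; ...'."""
    parts = [p.split(":", 1) for p in nl_query.split(";") if ":" in p]
    return {modality.strip(): pred.strip() for modality, pred in parts}


def generate_operators(parsed: Dict[str, str]) -> List[Operator]:
    """(2) Turn the parsed query into one operator per modality."""
    return [Operator(modality, pred) for modality, pred in parsed.items()]


def route(op: Operator, adapters: Dict[str, Adapter]) -> Adapter:
    """(3) Pick the backend registered for the operator's modality."""
    return adapters[op.modality]


def aggregate(results: List[List[str]]) -> List[str]:
    """(5) Merge per-backend results: union, first-seen order (assumed)."""
    merged: Dict[str, None] = {}
    for result in results:
        for item in result:
            merged.setdefault(item, None)
    return list(merged)


def run(nl_query: str, adapters: Dict[str, Adapter]) -> List[str]:
    ops = generate_operators(parse_query(nl_query))
    return aggregate([route(op, adapters).execute(op) for op in ops])


adapters = {
    "text": StubAdapter({"doc1": "a report about cats",
                         "doc2": "a report about dogs"}),
    "image": StubAdapter({"img1": "photo of cats"}),
}
print(run("text: cats; image: cats", adapters))  # prints ['doc1', 'img1']
```

The point of the sketch is the separation of concerns: the application only ever calls `run`, while each backend's API differences stay behind its adapter, so adding a backend means registering one more adapter.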