Meta Engine: A Unified Semantic Query Engine on Heterogeneous LLM-Based Query Systems

With the increasing use of multi-modal data, semantic querying is in growing demand in data management systems as an important way to access and analyze such data. Because multi-modal data (text, image, video, etc.) is unstructured, most of its information is hidden in its semantics, which traditional database queries such as SQL cannot access. Given the power of Large Language Models (LLMs) in understanding semantics and processing natural language, several LLM-based semantic query systems have been proposed in recent years to support semantic querying over unstructured data. However, this rapid growth has produced a fragmented ecosystem. Applications face significant integration challenges due to (1) the disparate APIs of different semantic query systems and (2) a fundamental trade-off between specialization and generality. Many semantic query systems are highly specialized, offering state-of-the-art performance within a single modality but struggling with multi-modal data. Conversely, some "all-in-one" systems handle multiple modalities but often exhibit suboptimal performance compared to their specialized counterparts in specific modalities. This paper introduces Meta Engine, a novel "query system on query systems" designed to resolve these challenges. Meta Engine is a unified semantic query engine that integrates heterogeneous, specialized LLM-based query systems. Its architecture comprises five key components: (1) a Natural Language (NL) Query Parser, (2) an Operator Generator, (3) a Query Router, (4) a set of Adapters, and (5) a Result Aggregator. In the evaluation, Meta Engine consistently outperforms all baselines, yielding 3-6x higher F1 in most cases and up to 24x on specific datasets.
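
To make the five-component architecture concrete, the following is a minimal sketch of how such a pipeline could be wired together. It is not the authors' implementation: the abstract does not describe the internals of any component, so every name here (`Operator`, `parse_query`, `generate_operators`, `route`, `aggregate`, `run`, and the two adapters) is hypothetical. The keyword-based parsing and modality detection are toy stand-ins for what would presumably be LLM-driven steps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Operator:
    """Hypothetical unit of work: one sub-query bound to one modality."""
    modality: str      # e.g. "text" or "image"
    instruction: str   # the sub-query text to hand to a backend system


# An adapter wraps one backend query system behind a common call signature.
Adapter = Callable[[Operator], List[str]]


def parse_query(nl_query: str) -> List[str]:
    """(1) NL Query Parser, toy version: split the query on ' and '."""
    return [part.strip() for part in nl_query.split(" and ") if part.strip()]


def generate_operators(sub_queries: List[str]) -> List[Operator]:
    """(2) Operator Generator, toy version: guess modality from keywords."""
    operators = []
    for sub_query in sub_queries:
        lowered = sub_query.lower()
        is_image = "image" in lowered or "photo" in lowered
        operators.append(Operator("image" if is_image else "text", sub_query))
    return operators


def route(op: Operator, adapters: Dict[str, Adapter]) -> Adapter:
    """(3) Query Router: pick the adapter registered for the modality."""
    return adapters[op.modality]


def text_adapter(op: Operator) -> List[str]:
    """(4) Adapter stub standing in for a specialized text query system."""
    return [f"text-system result for: {op.instruction}"]


def image_adapter(op: Operator) -> List[str]:
    """(4) Adapter stub standing in for a specialized image query system."""
    return [f"image-system result for: {op.instruction}"]


def aggregate(partials: List[List[str]]) -> List[str]:
    """(5) Result Aggregator, toy version: order-preserving de-duplication."""
    merged: List[str] = []
    for partial in partials:
        for item in partial:
            if item not in merged:
                merged.append(item)
    return merged


def run(nl_query: str, adapters: Dict[str, Adapter]) -> List[str]:
    """Chain the five stages: parse, generate, route, execute, aggregate."""
    operators = generate_operators(parse_query(nl_query))
    partials = [route(op, adapters)(op) for op in operators]
    return aggregate(partials)


if __name__ == "__main__":
    registry = {"text": text_adapter, "image": image_adapter}
    query = "find reviews praising the camera and find images of red phones"
    for line in run(query, registry):
        print(line)
```

In this sketch, the adapter registry is what lets the engine integrate heterogeneous backends: supporting another specialized system would mean registering one more adapter, without changing the parser, router, or aggregator.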
