Graph databases are gaining momentum thanks to the flexibility and expressiveness of their data models and query languages. A standardization effort driven by ISO/IEC is also ongoing and has already led to the specification of the first versions of two standard graph query languages, SQL/PGQ and GQL, in 2023 and 2024, respectively. Apart from the standards, current graph database systems provide a wide range of concrete graph query languages, each offering different query features. A common limitation of current graph query engines is the absence of an algebraic approach for evaluating path queries. To address this, we introduce an abstract algebra for evaluating path queries, allowing paths to be treated as first-class entities within the query processing pipeline. We demonstrate that our algebra can express a core fragment of the path queries defined in GQL and SQL/PGQ, thereby serving as a formal framework for studying both standards and supporting their implementation in current graph database systems. We also show that evaluation trees for path algebra expressions can function as logical plans for evaluating path queries and enable the application of query optimization techniques. Our algebraic framework has the potential to act as a lingua franca for path query evaluation, enabling different implementations to be expressed and compared.
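To make the idea of paths as first-class values concrete, the following is a minimal sketch of how an expression tree over sets of paths could double as a logical plan. It is an illustration only: the operator names `Edges`, `Select`, `Join`, and `UnionOp`, the tuple encoding of paths, and the toy graph are assumptions made for this example and are not the operators defined by the algebra itself.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# Assumed encoding: a path is an alternating tuple (node, label, node, label, ..., node).
Path = Tuple[str, ...]
PathSet = FrozenSet[Path]


@dataclass(frozen=True)
class Edges:
    """Leaf operator: every edge of the graph, viewed as a path of length one."""
    edges: Tuple[Tuple[str, str, str], ...]  # (source, label, target)


@dataclass(frozen=True)
class Select:
    """Keep only the paths that satisfy a predicate."""
    predicate: Callable[[Path], bool]
    child: object


@dataclass(frozen=True)
class Join:
    """Concatenate paths whose end node matches the start node of the other."""
    left: object
    right: object


@dataclass(frozen=True)
class UnionOp:
    """Set union of two path sets."""
    left: object
    right: object


def evaluate(plan) -> PathSet:
    """Evaluate an expression tree bottom-up; each node returns a set of paths."""
    if isinstance(plan, Edges):
        return frozenset((s, label, t) for s, label, t in plan.edges)
    if isinstance(plan, Select):
        return frozenset(p for p in evaluate(plan.child) if plan.predicate(p))
    if isinstance(plan, Join):
        left, right = evaluate(plan.left), evaluate(plan.right)
        # Drop the duplicated junction node when gluing two paths together.
        return frozenset(p + q[1:] for p in left for q in right if p[-1] == q[0])
    if isinstance(plan, UnionOp):
        return evaluate(plan.left) | evaluate(plan.right)
    raise TypeError(f"unknown operator: {plan!r}")


# Hypothetical toy graph used only for this example.
graph = Edges((("a", "knows", "b"), ("b", "knows", "c"), ("b", "likes", "d")))

# Plan: select the "knows" edges, then join the result with itself (two hops).
knows = Select(lambda p: p[1] == "knows", graph)
two_hop_knows = Join(knows, knows)

result = evaluate(two_hop_knows)
print(sorted(result))
```

Because every operator consumes and produces a set of paths, subtrees can be reordered or rewritten (for instance, pushing a `Select` below a `Join`) without changing the interface between plan nodes, which is the property that makes optimization over such trees possible.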