Graph databases are gaining momentum thanks to the flexibility and expressiveness of their data model and query languages. An ongoing standardization effort driven by ISO/IEC has already led to the specification of the first versions of two standard graph query languages, SQL/PGQ and GQL, published in 2023 and 2024, respectively. Beyond the standards, commercial and open-source graph databases offer a panoply of concrete graph query languages, each exhibiting different features and modes. In this paper, we tackle the heterogeneity of graph query languages by laying the foundations of a unifying path-oriented algebraic framework. Such a theoretical framework is currently missing from the graph database landscape, which impedes the emergence of a lingua franca in which different graph query language implementations can be expressed and compared. Our framework provides a blueprint for the correct implementation of graph queries of varying expressiveness. It makes it possible to overcome the limitations of the current versions of the standard query languages, thus paving the way for future extensions, including query composability. Moreover, when the path-based semantics is stripped away, the framework can express Codd's classical relational algebra extended with a recursive operator, which demonstrates its utility for a wide range of queries in database management systems.