Regular path queries (RPQs) are an essential component of graph query languages. Such a query takes a regular expression r and a directed edge-labeled graph G, and searches for paths in G whose sequence of edge labels is in the language of r. To avoid having to consider infinitely many paths, some database engines restrict such paths to be trails, that is, paths without repeated edges. In this paper we consider the evaluation problem for RPQs under trail semantics in the case where the expression is fixed. We show that, in this setting, there is a trichotomy. More precisely, the complexity of RPQ evaluation divides the regular languages into the finite languages, the class Ttract (for which the problem is tractable), and the rest. Interestingly, the tractable class in this trichotomy is larger than the tractable class in the trichotomy for simple paths discovered by Bagan, Bonifati, and Groz [JCSS 2020]. In addition to this trichotomy result, we study characterizations of the tractable class, its expressivity, the recognition problem, and closure properties, and we show how the decision problem can be extended to the enumeration problem, which is relevant in practice.
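To make the notion of trail semantics concrete, the following is a minimal brute-force sketch of the decision problem; it is not the algorithm of the paper and runs in exponential time in the worst case. It assumes the regular expression has already been compiled into a deterministic finite automaton, and the names `find_trail`, `dfa_delta`, `dfa_start`, and `dfa_accept` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

# An edge is (source node, label, target node); parallel edges are allowed
# and are distinguished by their index in the edge list.
Edge = Tuple[Hashable, str, Hashable]


def find_trail(
    edges: List[Edge],
    dfa_delta: Dict[Tuple[int, str], int],  # assumed: partial DFA transition map
    dfa_start: int,
    dfa_accept: Set[int],
    source: Hashable,
    target: Hashable,
) -> Optional[List[int]]:
    """Return the edge indices of one trail from source to target whose label
    sequence is accepted by the DFA, or None if no such trail exists."""
    # Adjacency list: node -> [(edge index, label, target node)].
    out: Dict[Hashable, List[Tuple[int, str, Hashable]]] = {}
    for i, (u, a, v) in enumerate(edges):
        out.setdefault(u, []).append((i, a, v))

    used: Set[int] = set()   # edges on the current trail (no edge may repeat)
    trail: List[int] = []    # the current trail as a list of edge indices

    def dfs(node: Hashable, state: int) -> bool:
        if node == target and state in dfa_accept:
            return True
        for i, a, v in out.get(node, []):
            nxt = dfa_delta.get((state, a))
            if i in used or nxt is None:
                continue  # edge already on the trail, or the DFA rejects here
            used.add(i)
            trail.append(i)
            if dfs(v, nxt):
                return True
            used.remove(i)  # backtrack
            trail.pop()
        return False

    return list(trail) if dfs(source, dfa_start) else None


# DFA for the finite language {aa}: 0 -a-> 1 -a-> 2, accepting state 2.
delta = {(0, "a"): 1, (1, "a"): 2}

# A single self-loop would have to be used twice, so there is no trail.
one_loop = find_trail([("x", "a", "x")], delta, 0, {2}, "x", "x")

# Two parallel self-loops give a trail; the node x repeats, the edges do not.
two_loops = find_trail([("x", "a", "x"), ("x", "a", "x")], delta, 0, {2}, "x", "x")
```

The search only forbids repeated edges, so a trail may revisit a node; this is the difference from simple-path semantics, where the second query would also fail.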