Regular path queries (RPQs) are an essential component of graph query languages. Given a regular expression r and a directed edge-labeled graph G, such a query searches for paths in G whose sequence of labels is in the language of r. To avoid having to consider infinitely many paths, some database engines restrict such paths to be trails, that is, they only consider paths without repeated edges. In this paper we consider the evaluation problem for RPQs under trail semantics, in the case where the expression is fixed. We show that, in this setting, there exists a trichotomy. More precisely, the complexity of RPQ evaluation divides the regular languages into the finite languages, the class Ttract (for which the problem is tractable), and the rest. Interestingly, the tractable class in this trichotomy is larger than the tractable class in the trichotomy for simple paths, discovered by Bagan, Bonifati, and Groz [JCSS 2020]. In addition to this trichotomy result, we study characterizations of the tractable class, its expressivity, its recognition problem, and its closure properties, and we show how the decision problem can be extended to the enumeration problem, which is relevant in practice.
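To make the notion of trail semantics concrete, the following is a minimal brute-force sketch, not the evaluation algorithm studied in the paper. It enumerates every trail between two nodes and keeps those whose label word matches the expression. The function name `matching_trails`, the edge-triple representation, and the restriction to single-character labels are assumptions made for illustration only. The search takes exponential time in the worst case.

```python
import re
from typing import Iterator, List, Tuple

# Hypothetical representation: (source, label, target), one character per label.
Edge = Tuple[str, str, str]


def matching_trails(
    edges: List[Edge], pattern: str, source: str, target: str
) -> Iterator[List[Edge]]:
    """Yield every trail from source to target whose label word matches pattern."""
    regex = re.compile(pattern)

    def extend(node: str, used: set, trail: List[Edge]) -> Iterator[List[Edge]]:
        # Check the word spelled by the current trail against the whole expression.
        word = "".join(label for _, label, _ in trail)
        if node == target and regex.fullmatch(word):
            yield list(trail)
        # Try each outgoing edge that the trail has not used yet.
        for i, (u, _label, v) in enumerate(edges):
            if u == node and i not in used:
                used.add(i)
                trail.append(edges[i])
                yield from extend(v, used, trail)
                trail.pop()
                used.remove(i)

    yield from extend(source, set(), [])


# A self-loop labeled "a" on s, and an edge labeled "b" from s to t.
edges = [("s", "a", "s"), ("s", "b", "t")]
words = ["".join(label for _, label, _ in t)
         for t in matching_trails(edges, "a*b", "s", "t")]
print(words)  # ['ab', 'b']
```

In this small graph the word "aab" is in the language of `a*b`, but no trail spells it, because the self-loop may be used only once. The trail spelling "ab" visits node s twice, so it is a trail but not a simple path. Because every edge is used at most once, the set of trails is finite and the search always terminates.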