We consider the Distinct Shortest Walks problem. Given two vertices $s$ and $t$ of a graph database $\mathcal{D}$ and a regular path query, enumerate all walks of minimal length from $s$ to $t$ that carry a label that conforms to the query. Usual theoretical solutions turn out to be inefficient when applied to graph models that are closer to real-life systems, in particular because edges may carry multiple labels. Indeed, known algorithms may repeat the same answer exponentially many times. We propose an efficient algorithm for multi-labelled graph databases. The preprocessing runs in $O(|\mathcal{D}|\times|\mathcal{A}|)$ and the delay between two consecutive outputs is in $O(\lambda\times|\mathcal{A}|)$, where $\mathcal{A}$ is a nondeterministic automaton representing the query and $\lambda$ is the minimal length. The algorithm can handle $\varepsilon$-transitions in $\mathcal{A}$ or queries given as regular expressions at no additional cost.
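
To make the repetition issue concrete, the following is a minimal sketch of a naive baseline, not the algorithm proposed here. It enumerates every shortest accepting run in the product of the graph and the automaton and projects each run onto its sequence of edges. The function names, the edge encoding `(source, target, labels)`, and the transition table `delta` are all hypothetical choices made for illustration, and $\varepsilon$-transitions are not handled.

```python
from collections import deque

def product_steps(edges, delta, v, q):
    """Yield (edge_id, next_vertex, next_state) for each product transition from (v, q)."""
    for eid, (src, dst, labels) in enumerate(edges):
        if src != v:
            continue
        for a in sorted(labels):              # one product step per label on the edge
            for q2 in sorted(delta.get((q, a), ())):
                yield eid, dst, q2

def naive_shortest_walks(edges, delta, initial, final, s, t):
    """Return the walk (tuple of edge ids) of every shortest accepting run, WITH repetitions."""
    # Breadth-first search over the product graph to get distances from the initial pairs.
    dist = {(s, q): 0 for q in initial}
    queue = deque(dist)
    while queue:
        v, q = queue.popleft()
        for _, v2, q2 in product_steps(edges, delta, v, q):
            if (v2, q2) not in dist:
                dist[(v2, q2)] = dist[(v, q)] + 1
                queue.append((v2, q2))
    reachable = [dist[(t, q)] for q in final if (t, q) in dist]
    if not reachable:
        return []
    lam = min(reachable)                      # the minimal length (lambda in the text)

    out = []
    def extend(v, q, walk):
        if len(walk) == lam:
            if v == t and q in final:
                out.append(tuple(walk))
            return
        for eid, v2, q2 in product_steps(edges, delta, v, q):
            if dist[(v2, q2)] == len(walk) + 1:   # stay on shortest product paths only
                walk.append(eid)
                extend(v2, q2, walk)
                walk.pop()
    for q in initial:
        extend(s, q, [])
    return out

# Hypothetical example: a chain of 3 edges, each carrying both labels a and b,
# queried with (a|b)* encoded as a one-state automaton.
edges = [(0, 1, {"a", "b"}), (1, 2, {"a", "b"}), (2, 3, {"a", "b"})]
delta = {(0, "a"): {0}, (0, "b"): {0}}
runs = naive_shortest_walks(edges, delta, initial={0}, final={0}, s=0, t=3)
print(len(runs), len(set(runs)))
```

On this chain there is a single shortest walk, yet the baseline reports it once per choice of label on each edge, so the number of repetitions doubles with every additional edge. Deduplicating afterwards would remove the repeats but not the exponential work spent producing them.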