We consider the Distinct Shortest Walks problem: given two vertices $s$ and $t$ of a graph database $\mathcal{D}$ and a regular path query, enumerate all walks of minimal length from $s$ to $t$ whose label conforms to the query. Standard theoretical solutions turn out to be inefficient when applied to graph models that are closer to real-life systems, in particular because edges may carry multiple labels. Indeed, known algorithms may repeat the same answer exponentially many times. We propose an efficient algorithm for multi-labelled graph databases. The preprocessing runs in $O(|\mathcal{D}|\times|\mathcal{A}|)$ and the delay between two consecutive outputs is in $O(\lambda\times|\mathcal{A}|)$, where $\mathcal{A}$ is a nondeterministic automaton representing the query and $\lambda$ is the minimal length. The algorithm can handle $\varepsilon$-transitions in $\mathcal{A}$ or queries given as regular expressions at no additional cost.
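To make the problem statement concrete, here is a minimal Python sketch of one simple way to enumerate each shortest matching walk exactly once. It is not the algorithm proposed here and makes no claim to the stated delay bound. The function name `distinct_shortest_walks`, the encoding of edges as `(source, set_of_labels, target)` triples, and the transition table `delta` are all assumptions made for illustration, and $\varepsilon$-transitions are not handled. The sketch runs a backward breadth-first search in the product of $\mathcal{D}$ and $\mathcal{A}$, then extends walks edge by edge while tracking a set of automaton states, so a walk is output once regardless of how many labellings or runs accept it.

```python
from collections import defaultdict, deque


def distinct_shortest_walks(edges, delta, initial, final, s, t):
    """Yield each shortest matching walk from s to t once, as a tuple of edge indices.

    edges: list of (source, set_of_labels, target) triples (hypothetical encoding)
    delta: dict mapping (state, label) to a set of successor states, no epsilon moves
    """
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for i, (u, _, v) in enumerate(edges):
        out_edges[u].append(i)
        in_edges[v].append(i)
    rev_delta = defaultdict(set)
    for (q, a), targets in delta.items():
        for q2 in targets:
            rev_delta[(q2, a)].add(q)

    # Backward BFS in the product: dist[(v, q)] is the length of a shortest
    # walk from v to t that some run starting in q accepts.
    dist = {(t, f): 0 for f in final}
    queue = deque(dist)
    while queue:
        v, q2 = queue.popleft()
        for i in in_edges[v]:
            u, labels, _ = edges[i]
            for a in labels:
                for q in rev_delta.get((q2, a), ()):
                    if (u, q) not in dist:
                        dist[(u, q)] = dist[(v, q2)] + 1
                        queue.append((u, q))

    starts = [dist[(s, q)] for q in initial if (s, q) in dist]
    if not starts:
        return  # no matching walk at all
    lam = min(starts)  # the minimal length, lambda in the text

    def extend(u, states, remaining, walk):
        if remaining == 0:
            yield tuple(walk)
            return
        for i in out_edges[u]:
            _, labels, v = edges[i]
            # Keep only states from which the target is still reachable in
            # exactly remaining - 1 steps, so every branch leads to an answer.
            nxt = {q2 for q in states for a in labels
                   for q2 in delta.get((q, a), ())
                   if dist.get((v, q2)) == remaining - 1}
            if nxt:
                walk.append(i)
                yield from extend(v, nxt, remaining - 1, walk)
                walk.pop()

    first = {q for q in initial if dist.get((s, q)) == lam}
    yield from extend(s, first, lam, [])


# Query (a|b)*: one state that is both initial and final.
edges = [("s", {"a", "b"}, "m"), ("m", {"a", "b"}, "t"),
         ("s", {"c"}, "t"), ("m", {"b"}, "t")]
delta = {(0, "a"): {0}, (0, "b"): {0}}
walks = list(distinct_shortest_walks(edges, delta, {0}, {0}, "s", "t"))
print(walks)  # [(0, 1), (0, 3)]
```

In this toy instance the walk through edges 0 and 1 can be labelled in four ways (`aa`, `ab`, `ba`, `bb`), so an enumeration over labelled paths in the product would report it four times. Grouping automaton states per edge, as above, reports it once.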