The resilience problem for a query and an input set or bag database is to compute the minimum number of facts to remove from the database to make the query false. In this paper, we study how to compute the resilience of Regular Path Queries (RPQs) over graph databases. Our goal is to characterize the regular languages $L$ for which it is tractable to compute the resilience of the existentially-quantified RPQ built from $L$. We show that computing the resilience in this sense is tractable (even in combined complexity) for all RPQs defined from so-called local languages. By contrast, we show hardness in data complexity for RPQs defined from the following language classes (after reducing the languages to eliminate redundant words): all finite languages featuring a word containing a repeated letter, and all languages featuring a specific kind of counterexample to being local (which we call four-legged languages). The latter include in particular all languages that are not star-free. Our results also imply hardness for all non-local languages with a so-called neutral letter. We also highlight some remaining obstacles towards a full dichotomy. In particular, for the RPQ $abc|be$, resilience is tractable but the only PTIME algorithm that we know uses submodular function optimization.
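To make the definition of resilience concrete, the following is a minimal brute-force sketch in Python, not one of the algorithms of the paper. It assumes set semantics, a finite language given as a set of non-empty words over single-character labels, and a graph database given as a set of `(source, label, target)` facts. The names `query_holds`, `resilience`, and `db` are hypothetical and chosen only for illustration. The search is exponential in the number of facts, so it is only usable on tiny instances.

```python
from itertools import combinations


def query_holds(facts, words):
    """Return True if some walk in the graph spells a word of the language.

    facts: set of (source, label, target) triples (set semantics).
    words: finite set of non-empty words over single-character labels.
    """
    nodes = {u for (u, _, _) in facts} | {v for (_, _, v) in facts}
    for word in words:
        # frontier = nodes reachable from any start node by reading a prefix of word
        frontier = set(nodes)
        for letter in word:
            frontier = {v for (u, a, v) in facts if a == letter and u in frontier}
            if not frontier:
                break
        if frontier:
            return True
    return False


def resilience(facts, words):
    """Minimum number of facts to remove so that the query becomes false.

    Brute force: try all removal sets by increasing size (exponential time).
    """
    facts = set(facts)
    for k in range(len(facts) + 1):
        for removed in combinations(sorted(facts), k):
            if not query_holds(facts - set(removed), words):
                return k
    return len(facts)  # not reached when all words are non-empty


# Illustrative database for the RPQ abc|be: the walk 1-2-3-4 spells "abc"
# and the walk 2-3-5 spells "be"; both use the single b-fact.
db = {("1", "a", "2"), ("2", "b", "3"), ("3", "c", "4"), ("3", "e", "5")}
print(resilience(db, {"abc", "be"}))  # 1: removing ("2", "b", "3") falsifies the query
```

On this instance, every match of the query goes through the fact `("2", "b", "3")`, so removing that single fact suffices and the resilience is 1.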