The problem of repairing inconsistent knowledge bases has a long history in the database theory and knowledge representation and reasoning communities, especially from the perspective of structured data. However, as the data available in real-world domains becomes more complex and interconnected, the need arises for new types of repositories, representation languages, and semantics that support more suitable ways of querying and reasoning about such data. Graph databases provide an effective way to represent relationships among semi-structured data, and allow these connections to be processed and queried efficiently. In this work, we focus on the problem of computing prioritized repairs over graph databases with data values, using a notion of consistency based on Reg-GXPath expressions as integrity constraints. We present several preference criteria based on the standard subset repair semantics, incorporating weights, multisets, and set-based priority levels. We study the most common repairing tasks, showing that it is possible to maintain the same computational complexity as in the setting where no preference criterion is available. To complete the picture, we explore the complexity of consistent query answering in this setting and obtain tight lower and upper bounds for all the preference criteria introduced.