We introduce a general abstract framework for database repairing in which the repair notions are defined using formal logic. We differentiate between integrity constraints and so-called query constraints. The former model consistency and desirable properties of the data (such as functional dependencies and independencies), while the latter relate two database instances according to their answers to the corresponding queries. The framework also admits a distinction between hard and soft queries, which makes it possible to preserve the answers to a core set of queries and to define a distance between instances based on query answers. We exemplify how various notions of repairs from the literature can be modelled in our unifying framework. Furthermore, we initiate a complexity-theoretic analysis of the problems of consistent query answering, repair computation, and existence of a repair within the new framework. We present both coNP- and NP-hard cases that illustrate the interplay between computationally hard problems and more flexible repair notions. We show general upper bounds in NP and the second level of the polynomial hierarchy. Finally, we relate the existence of a repair to model checking of existential second-order logic.