We consider the following fundamental problem: given a database D, a Boolean conjunctive query (CQ) q, and a fact f in D, decide whether f is relevant to q with respect to D, i.e., whether f belongs to a minimal subset S of D such that S |= q. Although relevance is central to query answer explanation, its combined complexity has not been studied in detail, leaving open what makes the problem hard and which restrictions yield lower complexity. Relevance has already been shown to be harder than query evaluation: it is $\Sigma^p_2$-complete for CQs, even over a binary signature. We further observe that NP-hardness already holds for (acyclic) chain CQs. Our work identifies self-joins (multiple atoms with the same relation) as the culprit. Indeed, we prove that if we forbid or bound the occurrence of self-joins, then relevance has the same complexity as query evaluation, namely NP (without structural restrictions) and LogCFL (for classes of bounded hypertreewidth). In the ontology setting, we establish an analogous result for ontology-mediated queries consisting of a CQ and a DL-Lite_R ontology: relevance is no harder than query answering provided that we bound the interaction width, which generalizes both self-join width and a recently introduced 'interaction-free' condition. Our results thus pinpoint what makes relevance harder than query evaluation and identify natural classes of queries that admit efficient relevance computation.
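To make the definition of relevance concrete, the following is a minimal brute-force sketch in Python. It is not the procedure studied in this work, only a direct reading of the definition. The encoding is an assumption made for illustration: facts and atoms are tuples whose first entry is the relation name, and terms starting with `?` are variables. The helper names `entails` and `is_relevant` are hypothetical. The search is limited to subsets of at most |q| facts, since a minimal support is the image of a homomorphism from q.

```python
from itertools import combinations

def is_var(term):
    # Assumed convention: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def entails(facts, query, binding=None):
    """Backtracking search for a homomorphism from the query into the facts."""
    binding = binding or {}
    if not query:
        return True
    atom, rest = query[0], query[1:]
    for fact in facts:
        if fact[0] != atom[0] or len(fact) != len(atom):
            continue
        new = dict(binding)
        ok = True
        for term, value in zip(atom[1:], fact[1:]):
            if is_var(term):
                # Bind the variable, or check consistency with an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok and entails(facts, rest, new):
            return True
    return False

def is_relevant(db, query, fact):
    """True iff `fact` lies in some minimal subset S of db with S |= query."""
    if fact not in db:
        return False
    others = [g for g in db if g != fact]
    # A minimal support has at most len(query) facts.
    for k in range(min(len(query), len(db))):
        for extra in combinations(others, k):
            s = [fact, *extra]
            # By monotonicity, S is minimal iff dropping any one fact breaks entailment.
            if entails(s, query) and all(
                not entails([g for g in s if g != h], query) for h in s
            ):
                return True
    return False

# Self-join query: exists x, y, z such that R(x, y) and R(y, z).
q = [("R", "?x", "?y"), ("R", "?y", "?z")]
db = [("R", "a", "a"), ("R", "a", "b")]
print(is_relevant(db, q, ("R", "a", "a")))  # True
print(is_relevant(db, q, ("R", "a", "b")))  # False
```

In this toy instance R(a,b) takes part in a match of q (x = y = a, z = b), yet it is not relevant, because the loop R(a,a) alone already satisfies q. This is the kind of effect that self-joins introduce: participating in a homomorphism does not guarantee membership in a minimal support.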