Large knowledge graphs capture information about a large number of entities and their relations. Among the many relations they capture, class subsumption assertions are usually present and are expressed using the \texttt{rdfs:subClassOf} construct. Our examination shows that publicly available knowledge graphs contain many potentially erroneous cyclic subclass relations, a problem that can be exacerbated when different knowledge graphs are integrated as Linked Open Data. In this paper, we present an automatic approach for resolving such cycles at scale using automated reasoning, by encoding the cycle-resolution problem as a MAXSAT problem. The approach is tested on the LOD-a-lot dataset and compared against a semi-automatic version of our algorithm. We show that the number of removed triples trades off against the efficiency of the algorithm.
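To make the idea of treating cycle resolution as a MAXSAT-style problem concrete, the following is a minimal sketch and not the paper's implementation. It assumes one possible encoding: each cycle yields a hard clause requiring at least one of its edges to be removed, and each edge yields a unit-weight soft clause preferring to keep it. The names `find_cycles` and `resolve_cycles` are hypothetical, and an exhaustive search stands in for a real MAXSAT solver, so the sketch is only suitable for tiny graphs.

```python
from itertools import combinations


def find_cycles(edges):
    """Enumerate simple cycles, each as a frozenset of edges (tiny graphs only)."""
    graph = {}
    for sub, sup in edges:
        graph.setdefault(sub, []).append(sup)
    cycles = set()

    def dfs(start, node, path_nodes, path_edges):
        for nxt in graph.get(node, []):
            edge = (node, nxt)
            if nxt == start:
                cycles.add(frozenset(path_edges + [edge]))
            elif nxt not in path_nodes and nxt > start:
                # Only visit nodes larger than the start node, so each
                # cycle is found once, from its smallest node.
                dfs(start, nxt, path_nodes | {nxt}, path_edges + [edge])

    for start in graph:
        dfs(start, start, {start}, [])
    return cycles


def resolve_cycles(edges):
    """Return a minimum set of subclass edges whose removal breaks every cycle.

    Assumed encoding (illustrative only):
      hard clause per cycle: at least one edge of the cycle is removed;
      soft clause per edge:  the edge is kept (unit weight).
    Trying removal sets in order of increasing size maximises the number
    of satisfied soft clauses, which is what a MAXSAT solver would do.
    """
    edges = sorted(set(edges))
    cycles = find_cycles(edges)
    for k in range(len(edges) + 1):
        for removed in combinations(edges, k):
            removed_set = set(removed)
            if all(cycle & removed_set for cycle in cycles):
                return removed_set
    return set(edges)


# Toy rdfs:subClassOf graph with two overlapping cycles:
# A -> B -> A  and  A -> B -> C -> A.
example = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "D")]
print(resolve_cycles(example))  # {('A', 'B')}
```

In this toy graph, removing the single edge shared by both cycles is enough. Because the search is exhaustive and the cycles are enumerated explicitly, the sketch says nothing about how the approach scales to a dataset such as LOD-a-lot.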