Wikidata is a knowledge graph increasingly adopted by many communities for diverse applications. Wikidata statements are annotated with qualifier-value pairs that convey information such as the statement's validity context, causality, and provenance. Handling qualifiers in reasoning is a challenging problem. When defining inference rules, in particular rules on ontological properties (x subclass of y, z instance of x, etc.), one must take the qualifiers into account, as most of them participate in the semantics of the statements. This is difficult because a) there is a massive number of qualifiers, and b) the qualifiers of the inferred statement are often a combination of the qualifiers in the rule condition. In this work, we propose to address this problem by a) defining a categorization of the qualifiers, and b) formalizing the Wikidata model with a many-sorted logical language whose sorts are the qualifier categories. We couple this logic with an algebraic specification that provides a means for effectively handling qualifiers in inference rules. The approach supports the expression of all current Wikidata ontological properties. Finally, we discuss a methodology for implementing the approach in practice and present a prototype implementation.
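
To make the qualifier-combination problem concrete, the following is a minimal illustrative sketch, not the formalization proposed in this work. It assumes a single hypothetical qualifier category (validity time, modeled as a closed interval of years) and a single rule (transitivity of subclass of), and it assumes that the validity of the inferred statement is the intersection of the validities of the two premises. The names `Statement`, `intersect`, and `subclass_transitivity`, and the example data, are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: a validity context is a closed interval of years (start, end).
Interval = Tuple[int, int]


@dataclass(frozen=True)
class Statement:
    """A simplified statement carrying one validity qualifier (hypothetical model)."""
    subject: str
    prop: str
    value: str
    validity: Interval


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlap of two closed intervals, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def subclass_transitivity(s1: Statement, s2: Statement) -> Optional[Statement]:
    """From (x subclass of y) and (y subclass of z), infer (x subclass of z).

    Assumption: the inferred statement is valid only while both premises are,
    so its validity qualifier is the intersection of the premises' validities.
    """
    if s1.prop != "subclass of" or s2.prop != "subclass of":
        return None
    if s1.value != s2.subject:
        return None
    validity = intersect(s1.validity, s2.validity)
    if validity is None:
        return None  # the premises never hold at the same time
    return Statement(s1.subject, "subclass of", s2.value, validity)


# Invented example data.
a_sub_b = Statement("A", "subclass of", "B", (1990, 2010))
b_sub_c = Statement("B", "subclass of", "C", (2000, 2020))
inferred = subclass_transitivity(a_sub_b, b_sub_c)
```

Here `inferred` carries the validity `(2000, 2010)`, the period in which both premises hold. Other qualifier categories would need different combination operators, which is why a per-category treatment is needed.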