Wikidata is a knowledge graph that a growing number of communities use for diverse applications. Wikidata statements are annotated with qualifier-value pairs that convey information such as the validity context of a statement, its causality, and its provenance. Handling qualifiers in reasoning is challenging. When defining inference rules, in particular rules on ontological properties (x subclass of y, z instance of x, etc.), one must take the qualifiers into account, because most of them contribute to the semantics of the statements. The problem is complex because a) the number of qualifiers is very large, and b) the qualifiers of an inferred statement are often a combination of the qualifiers in the rule condition. In this work, we propose to address this problem by a) defining a categorization of the qualifiers and b) formalizing the Wikidata model with a many-sorted logical language (MSL) whose sorts are the qualifier categories. We couple this logic with an algebraic specification that provides a means for effectively handling qualifiers in inference rules. Using Wikidata ontological properties, we show how to use the MSL and the specification to reason on qualifiers. Finally, we discuss a methodology for implementing the work in practice and present a prototype implementation. Thanks to the extensibility of the many-sorted algebraic specification, the work can be naturally extended to cover more qualifiers, such as uncertain time, recurring events, and geographic locations.
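To make point b) concrete, the following is a minimal Python sketch of one rule on ontological properties: from "z instance of x" and "x subclass of y", infer "z instance of y", with a validity qualifier obtained by combining the qualifiers of the two premises. It is an illustration under stated assumptions, not the specification proposed in this work: the `Statement` class, the two function names, the restriction to year-valued start time (P580) and end time (P582) qualifiers, and the choice of interval intersection as the combination operator are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified statement with a validity interval.

    start/end stand in for the start time (P580) and end time (P582)
    qualifiers, reduced to years; None means "unbounded on that side".
    """
    subject: str
    prop: str
    value: str
    start: Optional[int] = None
    end: Optional[int] = None


def intersect_validity(
    a: Statement, b: Statement
) -> Optional[Tuple[Optional[int], Optional[int]]]:
    """Return the overlap of two validity intervals, or None if disjoint."""
    starts = [s for s in (a.start, b.start) if s is not None]
    ends = [e for e in (a.end, b.end) if e is not None]
    start = max(starts) if starts else None
    end = min(ends) if ends else None
    if start is not None and end is not None and start > end:
        return None  # the premises are never valid at the same time
    return (start, end)


def infer_instance_of(inst: Statement, sub: Statement) -> Optional[Statement]:
    """Apply: (z P31 x) and (x P279 y) => (z P31 y), combining validity.

    Assumption for this sketch: the inferred statement holds only while
    both premises hold, so its validity is the intersection of theirs.
    """
    if inst.prop != "P31" or sub.prop != "P279" or inst.value != sub.subject:
        return None  # the rule condition does not match
    validity = intersect_validity(inst, sub)
    if validity is None:
        return None
    return Statement(inst.subject, "P31", sub.value, *validity)


if __name__ == "__main__":
    premise_1 = Statement("z", "P31", "x", start=1990, end=2010)
    premise_2 = Statement("x", "P279", "y", start=2000)
    print(infer_instance_of(premise_1, premise_2))
```

Other qualifier categories (causality, provenance, and so on) would need their own combination operators; the sketch covers only the validity-interval case.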