Machine learning with Semantic Web ontologies can follow several strategies. One of them projects ontologies into graph structures and applies graph embeddings or other graph-based machine learning methods to the resulting graphs. Several methods have been developed that project ontology axioms into graphs. However, these methods are limited in the types of axioms they can project (totality), in whether they are invertible (injectivity), and in how they exploit semantic information. These limitations restrict the kinds of tasks to which they can be applied. Category-theoretical semantics of logic languages formalizes interpretations using categories instead of sets, and categories have a graph-like structure. We developed CatE, which uses the category-theoretical formulation of the semantics of the Description Logic $\mathcal{ALC}$ to generate a graph representation for ontology axioms. The CatE projection is total and injective, and therefore overcomes limitations of other graph-based ontology embedding methods, which are generally not invertible. We apply CatE to a number of different tasks, including deductive and inductive reasoning, and we demonstrate that CatE improves over state-of-the-art ontology embedding methods. Furthermore, we show that CatE can also outperform model-theoretic ontology embedding methods in machine learning tasks in the biomedical domain.
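To make the two properties named above concrete, the following is a minimal sketch of a toy projection that is neither total nor injective. It is not the CatE construction: the tuple encoding of axioms, the rule set, and the function names `project`, `is_total`, and `is_injective` are all assumptions made for illustration.

```python
# Toy illustration only: this is NOT the CatE construction.
# Axioms are nested tuples; the encoding and all names are hypothetical.

def project(axiom):
    """Map one axiom to a labelled edge, or None if no rule applies."""
    kind, lhs, rhs = axiom
    if kind != "sub" or not isinstance(lhs, str):
        return None
    if isinstance(rhs, str):
        # A ⊑ B becomes (A, subClassOf, B)
        return (lhs, "subClassOf", rhs)
    if rhs[0] in ("some", "all") and isinstance(rhs[2], str):
        # A ⊑ ∃r.B and A ⊑ ∀r.B both become (A, r, B):
        # the quantifier is lost, so the axiom cannot be recovered.
        return (lhs, rhs[1], rhs[2])
    return None  # e.g. negation: this toy rule set has no rule for it


def is_total(axioms):
    """True if every axiom is mapped to some edge."""
    return all(project(a) is not None for a in axioms)


def is_injective(axioms):
    """True if no two distinct axioms are mapped to the same edge."""
    edges = [e for e in map(project, axioms) if e is not None]
    return len(edges) == len(set(edges))


axioms = [
    ("sub", "Parent", "Person"),                        # Parent ⊑ Person
    ("sub", "Parent", ("some", "hasChild", "Person")),  # Parent ⊑ ∃hasChild.Person
    ("sub", "Parent", ("all", "hasChild", "Person")),   # Parent ⊑ ∀hasChild.Person
    ("sub", "Person", ("not", "Robot")),                # Person ⊑ ¬Robot
]

print(is_total(axioms))      # the negation axiom has no edge
print(is_injective(axioms))  # the ∃ and ∀ axioms share one edge
```

In this sketch the negation axiom is dropped, so the projection is not total, and the existential and universal axioms collapse onto the same edge, so it is not injective. These are the kinds of limitation that a total and injective projection avoids.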