Knowledge graph embeddings (KGE) have been extensively studied as a way to embed large-scale relational data for many real-world applications. Existing methods have long ignored the fact that many KGs contain two fundamentally different views: high-level ontology-view concepts and fine-grained instance-view entities. They usually embed all nodes as vectors in a single latent space. However, a single geometric representation fails to capture the structural differences between the two views and lacks probabilistic semantics for the granularity of concepts. We propose Concept2Box, a novel approach that jointly embeds the two views of a KG using dual geometric representations. We model concepts with box embeddings, which learn the hierarchical structure among concepts as well as complex relations such as overlap and disjointness. Box volumes can be interpreted as the granularity of concepts. In contrast, we model entities as vectors. To bridge the gap between concept box embeddings and entity vector embeddings, we propose a novel vector-to-box distance metric and learn both embeddings jointly. Experiments on both the public DBpedia KG and a newly created industrial KG show the effectiveness of Concept2Box.
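To make the dual representation concrete, the following is a minimal Python sketch of concepts as axis-aligned boxes and entities as plain vectors. It is an illustration under stated assumptions, not the formulation used by Concept2Box: the `Box` type, the `intersection` helper, the `vector_to_box_distance` function, and the weight `alpha` are all hypothetical names, and the distance shown is a common "outside distance plus down-weighted inside distance" form from box-embedding work that may differ from the metric the abstract refers to.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box for a concept (hypothetical type, for illustration)."""
    lower: Sequence[float]  # lower corner, one value per dimension
    upper: Sequence[float]  # upper corner, must exceed lower in every dimension

    def log_volume(self) -> float:
        # Sum of log side lengths. A larger value suggests a coarser concept.
        return sum(math.log(u - l) for l, u in zip(self.lower, self.upper))


def intersection(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None if they are disjoint."""
    lower: List[float] = [max(x, y) for x, y in zip(a.lower, b.lower)]
    upper: List[float] = [min(x, y) for x, y in zip(a.upper, b.upper)]
    if any(l >= u for l, u in zip(lower, upper)):
        return None  # no overlap in at least one dimension
    return Box(lower, upper)


def vector_to_box_distance(point: Sequence[float], box: Box,
                           alpha: float = 0.2) -> float:
    """Assumed distance: L1 distance outside the box plus alpha times the
    L1 distance from the box center to the point clipped into the box."""
    dist_out = 0.0
    dist_in = 0.0
    for p, l, u in zip(point, box.lower, box.upper):
        dist_out += max(p - u, 0.0) + max(l - p, 0.0)
        clipped = min(u, max(l, p))
        dist_in += abs((l + u) / 2.0 - clipped)
    return dist_out + alpha * dist_in


# Hypothetical toy data: two overlapping concepts and two entities.
person = Box([0.0, 0.0], [2.0, 2.0])
artist = Box([1.0, 1.0], [3.0, 3.0])
overlap = intersection(person, artist)
inside_entity = [1.0, 1.0]   # at the center of `person`
outside_entity = [3.0, 1.0]  # one unit to the right of `person`
d_inside = vector_to_box_distance(inside_entity, person)
d_outside = vector_to_box_distance(outside_entity, person)
```

In this sketch, an entity inside a concept box gets a small distance that shrinks toward the center, while an entity outside the box is penalized by how far it lies from the nearest face. A non-empty `intersection` corresponds to overlapping concepts, and `None` corresponds to disjoint ones.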