Knowledge graphs (KGs), which store large numbers of relational facts in the form (head, relation, tail), serve a wide range of applications. Many downstream tasks rely heavily on expressive modeling and predictive embedding of KGs, yet most current KG representation learning methods, in which each entity is embedded as a vector in Euclidean space and each relation as a transformation, follow an entity ranking protocol. On the one hand, such an embedding design cannot capture many-to-many relations. On the other hand, in many retrieval cases users want an exact set of answers without any ranking, especially when the results are expected to be precise, e.g., which genes cause an illness. Such scenarios are commonly referred to as "set retrieval". This work presents a pioneering study on the KG set retrieval problem. We show that set retrieval depends strongly on expressive modeling of many-to-many relations, and we propose a new KG embedding model, SpherE, to address this problem. SpherE is based on rotational embedding methods, but each entity is embedded as a sphere instead of a vector. While inheriting the high interpretability of rotation-based models, SpherE can more expressively model one-to-many, many-to-one, and many-to-many relations. Through extensive experiments, we show that SpherE addresses the set retrieval problem effectively while retaining good predictive ability to infer missing facts. The code is available at https://github.com/Violet24K/SpherE.
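The idea of embedding entities as spheres and relations as rotations can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the SpherE code: the function names `rotate` and `retrieve_tails`, the toy data, and in particular the membership rule, which here treats a fact (head, relation, tail) as true when the rotated head sphere and the tail sphere overlap. The scoring function actually used by SpherE may differ; consult the repository for the real implementation.

```python
import numpy as np

def rotate(center, phase):
    """Rotate a complex-valued center element-wise by a phase (RotatE-style)."""
    return center * np.exp(1j * phase)

def retrieve_tails(head, relation_phase, centers, radii):
    """Return the set of entity indices whose spheres overlap the rotated head sphere.

    Assumed rule (hypothetical): a fact holds when the distance between the
    rotated head center and the tail center is at most the sum of the two radii.
    """
    rotated = rotate(centers[head], relation_phase)
    dist = np.linalg.norm(centers - rotated, axis=1)
    hits = dist <= radii[head] + radii
    return set(np.flatnonzero(hits).tolist())

# Toy data: four entities with one complex dimension each (made up for the example).
centers = np.array([[1 + 0j], [0 + 1j], [0.2 + 1.1j], [-1 + 0j]])
radii = np.full(4, 0.15)

# A relation that rotates by 90 degrees maps entity 0 onto the neighborhood
# of entities 1 and 2, so both are returned as an unranked answer set.
answers = retrieve_tails(0, np.pi / 2, centers, radii)
print(answers)
```

Because each entity occupies a region rather than a point, one rotated head can overlap several tails at once, which is what lets a query return an exact set instead of a ranked list.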