Entity alignment (EA) aims to discover equivalent entities in different knowledge graphs (KGs) and plays an important role in knowledge engineering. Recently, EA with dangling entities has been proposed as a more realistic setting, which assumes that not all entities have corresponding equivalent entities. In this paper, we focus on this setting. Some work has explored this problem by leveraging translation APIs, pre-trained word embeddings, and other off-the-shelf tools. However, these approaches over-rely on side information (e.g., entity names) and fail to work when that side information is absent. At the same time, they still insufficiently exploit the most fundamental information in a KG: its graph structure. To improve the exploitation of structural information, we propose a novel entity alignment framework called Weakly-Optimal Graph Contrastive Learning (WOGCL), which is refined along three dimensions: (i) Model. We propose a novel Gated Graph Attention Network to capture local and global graph structure similarity. (ii) Training. We design two learning objectives, contrastive learning and optimal transport learning, to obtain distinguishable entity representations via the optimal transport plan. (iii) Inference. In the inference phase, a PageRank-based method is proposed to calculate higher-order structural similarity. Extensive experiments on two dangling benchmarks demonstrate that WOGCL outperforms the current state-of-the-art methods that use pure structural information in both the traditional (relaxed) and dangling (consolidated) settings. The code will be made public soon.
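To make the notion of an optimal transport plan over entity representations more concrete, the following is a minimal sketch and not the WOGCL implementation. It assumes entity embeddings are already available as NumPy arrays, uses cosine similarity, and computes an entropy-regularised transport plan with Sinkhorn iterations under uniform marginals. The function names `cosine_similarity` and `sinkhorn_plan`, and the parameters `reg` and `n_iters`, are hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity(x, y):
    """Pairwise cosine similarity between the rows of x and the rows of y."""
    x_norm = x / np.linalg.norm(x, axis=1, keepdims=True)
    y_norm = y / np.linalg.norm(y, axis=1, keepdims=True)
    return x_norm @ y_norm.T


def sinkhorn_plan(sim, reg=0.05, n_iters=200):
    """Entropy-regularised transport plan for a similarity matrix.

    Assumption: both KGs carry uniform mass (1/n per source entity,
    1/m per target entity). This is an illustrative choice only.
    """
    n, m = sim.shape
    # Gibbs kernel; subtracting the maximum keeps the exponent <= 0.
    kernel = np.exp((sim - sim.max()) / reg)
    a = np.full(n, 1.0 / n)  # source marginal
    b = np.full(m, 1.0 / m)  # target marginal
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows towards marginal a
        v = b / (kernel.T @ u)  # rescale columns towards marginal b
    return u[:, None] * kernel * v[None, :]


if __name__ == "__main__":
    # Toy data: the target embeddings are a permutation of the source ones.
    source = np.eye(4, 8)
    perm = np.array([2, 0, 3, 1])
    target = source[perm]
    plan = sinkhorn_plan(cosine_similarity(source, target))
    # For each target entity, the source entity receiving the most mass.
    print(plan.argmax(axis=0))
```

In this toy setup each target entity has an exact counterpart. In a dangling setting one would additionally need a rule for leaving entities unmatched, for example a threshold on the transported mass; that rule is likewise an assumption and is not described in the abstract.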