Entity alignment (EA) aims to link potentially equivalent entities across different knowledge graphs (KGs). Most existing EA methods are supervised: they require seed alignments, i.e., manually specified pairs of aligned entities. Several recent EA studies have attempted to eliminate the need for seed alignments. Despite preliminary progress, they still suffer from two limitations: (1) the entity embeddings produced by their GNN-like encoders lack personalization, since some of the aggregation subpaths are shared between different entities; (2) they cannot fully alleviate the distribution distortion between candidate KGs, owing to the absence of a supervised signal. In this work, we propose a novel unsupervised entity alignment approach, UNEA, to address these two issues. First, we parametrically sample a tree neighborhood rooted at each entity and develop a tree attention aggregation mechanism over it to extract a personalized embedding for each entity. Second, we introduce an auxiliary task that maximizes the mutual information between the input and the output of the KG encoder, which regularizes the model and prevents distribution distortion. Extensive experiments show that UNEA achieves a new state of the art on the unsupervised EA task and can even outperform many existing supervised EA baselines.
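The auxiliary objective of maximizing mutual information between the encoder's input and output can be illustrated with a small sketch. The abstract does not say which estimator UNEA uses, so the code below assumes an InfoNCE-style lower bound with a plain dot-product critic, which is one common choice and not necessarily the authors'. The function name `info_nce_lower_bound`, the `temperature` parameter, and the toy vectors are all hypothetical, and the sketch assumes input features and output embeddings share the same dimension.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_lower_bound(inputs, outputs, temperature=1.0):
    """InfoNCE lower bound (in nats) on the mutual information I(input; output).

    Assumption: inputs[i] and outputs[i] belong to the same entity (positive
    pair), and outputs[j] for j != i serve as negatives. The critic is a plain
    dot product; a trained model would typically use a learned critic.
    """
    n = len(inputs)
    if n == 0 or n != len(outputs):
        raise ValueError("inputs and outputs must be non-empty and equally long")

    total_log_prob = 0.0
    for i in range(n):
        scores = [dot(inputs[i], outputs[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates for entity i.
        max_score = max(scores)
        log_denominator = max_score + math.log(
            sum(math.exp(s - max_score) for s in scores)
        )
        # Log-probability of picking the true output among all candidates.
        total_log_prob += scores[i] - log_denominator

    # The bound is log(n) plus the mean log-probability; it never exceeds log(n).
    return math.log(n) + total_log_prob / n


if __name__ == "__main__":
    # Toy data: each output is strongly aligned with its own input.
    features = [[10.0, 0.0], [0.0, 10.0]]
    embeddings = [[10.0, 0.0], [0.0, 10.0]]
    print(info_nce_lower_bound(features, embeddings))

    # Collapsed encoder: all outputs identical, so the bound drops to zero.
    collapsed = [[0.0, 0.0], [0.0, 0.0]]
    print(info_nce_lower_bound(features, collapsed))
```

In training, the negative of this bound would be added to the loss as a regularizer. An encoder that collapses every entity to the same embedding scores zero, which is the kind of degenerate output distribution such a term is meant to discourage.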