Cross-lingual entity alignment is the task of finding semantically equivalent entities across knowledge graphs in different languages. In this paper, we propose a simple and novel unsupervised method for cross-lingual entity alignment. We use a deep-learning multilingual encoder combined with a machine translator to encode knowledge graph text, which reduces the reliance on labeled data. Unlike traditional methods that emphasize only global or only local alignment, our method considers both alignment strategies simultaneously. We first cast the alignment task as a bipartite matching problem and then adopt the re-exchanging idea to accomplish alignment. Whereas traditional bipartite matching algorithms return only a single optimal solution, our algorithm generates ranked matching results, which enable many potential downstream tasks. Additionally, our method accommodates two different types of optimization (minimal and maximal) in the bipartite matching process, which provides more flexibility. In our evaluation on the DBP15K dataset, the method achieves Hits@1 scores of 0.966, 0.990, and 0.996 on the Chinese-, Japanese-, and French-to-English alignment tasks, respectively. It outperforms the state-of-the-art methods in the unsupervised and semi-supervised categories. Compared with the state-of-the-art supervised method, our method is better by 2.6% and 0.4% on the Ja-En and Fr-En alignment tasks, respectively, and marginally lower by 0.2% on the Zh-En alignment task.
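To make the bipartite-matching framing in the abstract concrete, the following is a minimal sketch and not the paper's re-exchanging algorithm. It assumes a small, hypothetical similarity matrix `sim` (in practice such scores would come from the encoded knowledge graph text) and contrasts a local, ranked view (each source entity sorts its own candidates) with a global one-to-one assignment computed by the classical Hungarian algorithm. The function names `assign`, `ranked_candidates`, and `hits_at_k` are illustrative choices, and the `maximize` flag only mimics the minimal/maximal optimization options mentioned above.

```python
def assign(weights, maximize=True):
    """Classical Hungarian algorithm on a square matrix.

    Returns match[i] = column assigned to row i. This is the
    "single optimal solution" baseline, not the paper's method.
    """
    n = len(weights)
    sign = -1.0 if maximize else 1.0          # maximization = minimize negated scores
    cost = [[sign * w for w in row] for row in weights]
    inf = float("inf")
    u = [0.0] * (n + 1)                       # row potentials (1-based)
    v = [0.0] * (n + 1)                       # column potentials (1-based)
    p = [0] * (n + 1)                         # p[j] = row matched to column j, 0 = free
    way = [0] * (n + 1)                       # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:                    # reached a free column
                break
        while j0:                             # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def ranked_candidates(weights, maximize=True):
    """Local view: for each row, all columns sorted from best to worst."""
    return [sorted(range(len(row)), key=lambda j: row[j], reverse=maximize)
            for row in weights]


def hits_at_k(ranked, gold, k=1):
    """Fraction of rows whose gold column appears among the top-k candidates."""
    return sum(g in ranked[i][:k] for i, g in enumerate(gold)) / len(gold)


# Hypothetical similarity scores: rows = source entities, columns = target entities.
sim = [[0.90, 0.80, 0.10],
       [0.85, 0.20, 0.10],
       [0.10, 0.30, 0.70]]
gold = [1, 0, 2]                              # assumed ground-truth alignment

local = ranked_candidates(sim)                # rows 0 and 1 both prefer column 0
global_match = assign(sim, maximize=True)     # [1, 0, 2]
dist = [[1.0 - s for s in row] for row in sim]
same_match = assign(dist, maximize=False)     # same matching, posed as minimization

print(hits_at_k(local, gold, k=1))            # 2 of 3 rows correct
print(hits_at_k([[j] for j in global_match], gold, k=1))  # 1.0
```

In this toy case the purely local ranking sends two source entities to the same target, while the global assignment resolves the conflict but returns only one answer per entity. That gap is the motivation the abstract gives for combining both views and producing ranked matching results.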