Entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same real-world entity. Many machine learning-based methods have been proposed for this task. However, to the best of our knowledge, all existing methods require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose AutoAlign, the first fully automatic alignment method, which does not require any manually crafted seed alignments. Specifically, for predicate embeddings, AutoAlign constructs a predicate-proximity graph with the help of large language models to automatically capture the similarity between predicates across the two KGs. For entity embeddings, AutoAlign first computes the entity embeddings of each KG independently using TransE, and then shifts the two KGs' entity embeddings into the same vector space by computing the similarity between entities based on their attributes. Thus, both predicate alignment and entity alignment can be done without manually crafted seed alignments. AutoAlign is not only fully automatic but also highly effective. Experiments on real-world KGs show that AutoAlign significantly improves entity alignment performance over state-of-the-art methods.
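As a rough illustration of the two ingredients named above, the following minimal sketch shows the standard TransE scoring function, which treats a triple (h, r, t) as plausible when h + r lies close to t, and a nearest-neighbor matching step over entity embeddings that already share a vector space. It is not AutoAlign's actual procedure: the function names, the toy vectors, the cosine similarity measure, and the 0.9 threshold are all assumptions made for illustration.

```python
import math


def transe_score(head, relation, tail):
    """Standard TransE distance ||h + r - t||_2; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(entities_a, entities_b, threshold=0.9):
    """Match each entity in KG A to its most similar entity in KG B.

    Assumes both embedding tables already live in the same vector space.
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    matches = {}
    for name_a, vec_a in entities_a.items():
        best_name, best_sim = None, -1.0
        for name_b, vec_b in entities_b.items():
            sim = cosine(vec_a, vec_b)
            if sim > best_sim:
                best_name, best_sim = name_b, sim
        if best_name is not None and best_sim >= threshold:
            matches[name_a] = best_name
    return matches


# Toy, hand-made embeddings (hypothetical data, for illustration only).
kg_a = {"a:Paris": [1.0, 0.0], "a:Berlin": [0.0, 1.0]}
kg_b = {"b:Paris_FR": [0.9, 0.1], "b:Berlin_DE": [0.1, 0.9]}
alignment = align(kg_a, kg_b)
```

In this toy setting each entity in `kg_a` is paired with its obvious counterpart in `kg_b`. A real system would have to learn the embeddings and handle one-to-one constraints, neither of which this sketch attempts.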