Ontology matching (OM) identifies semantic relationships between concepts in two or more knowledge graphs (KGs) and is a critical step in integrating KGs from different sources. Recent deep OM models combine transformer-based language models with knowledge graph embeddings. Nevertheless, these models still face persistent challenges: a lack of reference alignments, runtime latency, and diverse graph structures that remain unexplored within an end-to-end framework. In this study, we introduce LaKERMap, a novel self-supervised OM framework trained on the input ontologies. The framework exploits the contextual and structural information of concepts by integrating implicit knowledge into transformers. Specifically, we use distinct training objectives to capture multiple structural contexts, covering both local and global interactions. We evaluate our methods on the Bio-ML datasets and tasks. The results show that LaKERMap surpasses state-of-the-art systems in both alignment quality and inference time. Our models and code are available at https://github.com/ellenzhuwang/lakermap.