In this paper, we propose a new perspective on unsupervised Ontology Matching (OM), also known as Ontology Alignment (OA): treating it as a translation task. Ontologies are represented as graphs, and translation is performed from a node in the source ontology graph to a path in the target ontology graph. The proposed framework, Truveta Mapper (TM), uses a multi-task sequence-to-sequence transformer model to perform alignment across multiple ontologies in a zero-shot, unified, and end-to-end manner. Multi-tasking enables the model to implicitly learn the relationships between different ontologies via transfer learning, without requiring any explicit, manually labeled cross-ontology data. This also enables the framework to outperform existing solutions in both runtime latency and alignment quality. The model is pre-trained and fine-tuned only on publicly available text corpora and inner-ontology data. The proposed solution outperforms state-of-the-art approaches, including Edit-Similarity, LogMap, AML, BERTMap, and the OM frameworks recently presented at the Ontology Alignment Evaluation Initiative (OAEI22). It also offers log-linear complexity and makes the OM task more efficient and straightforward overall, with little post-processing such as mapping extension or mapping repair. We are open-sourcing our solution.
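To make the node-to-path formulation concrete, the following is a minimal sketch of how a sequence-to-sequence training pair might be built from a single ontology. It is not the authors' implementation: the toy hierarchy, the `" > "` path separator, and the helper names `path_to_root` and `make_example` are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional

# Toy ontology: child label -> parent label (None marks the root).
# The labels are illustrative and not taken from any real ontology.
PARENT: Dict[str, Optional[str]] = {
    "disease": None,
    "heart disease": "disease",
    "myocardial infarction": "heart disease",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the labels on the path from the root down to `node`."""
    path: List[str] = []
    seen = set()
    current: Optional[str] = node
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        path.append(current)
        current = parent[current]
    path.reverse()  # collected leaf-to-root, returned root-to-leaf
    return path


def make_example(label: str, node: str,
                 parent: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Pair a textual label with the serialized path of its node.

    The " > " separator is a hypothetical serialization choice.
    """
    return {"input": label, "target": " > ".join(path_to_root(node, parent))}


# A synonym of the node, assumed here to come from the same ontology.
example = make_example("heart attack", "myocardial infarction", PARENT)
print(example)
```

Under these assumptions, the model input is a term label and the target is the root-to-node path, so a predicted path can be resolved back to a node in the target ontology by walking the hierarchy.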