Ontology matching (OM) plays an essential role in enabling semantic interoperability and integration across heterogeneous knowledge sources, particularly in the biomedical domain, which contains numerous complex concepts related to diseases and pharmaceuticals. This paper introduces GenOM, a large language model (LLM)-based ontology alignment framework that enriches the semantic representations of ontology concepts by generating textual definitions, retrieves alignment candidates with an embedding model, and incorporates exact-matching-based tools to improve precision. Extensive experiments on the OAEI Bio-ML track demonstrate that GenOM often achieves competitive performance, surpassing many baselines, including traditional OM systems and recent LLM-based methods. Ablation studies further confirm the effectiveness of semantic enrichment and few-shot prompting, highlighting the framework's robustness and adaptability.
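The three stages named in the abstract (definition-based enrichment, embedding-based candidate retrieval, and exact matching) can be illustrated with a minimal sketch. This is not GenOM's implementation: `generate_definition` is a hypothetical placeholder for the LLM call and reads from a small hand-written dictionary, `embed` is a bag-of-words stand-in for a real embedding model, and giving exact label matches priority over retrieved candidates is an assumption about how the stages might be combined. The labels and definitions are toy data, not taken from the Bio-ML track.

```python
import math
import re
from collections import Counter

# Toy definitions standing in for LLM-generated text (hypothetical data).
TOY_DEFINITIONS = {
    "myocardial infarction": "death of heart muscle tissue caused by blocked blood supply",
    "heart attack": "damage to heart muscle caused by blocked blood supply",
    "aspirin": "drug used to reduce pain fever and inflammation",
    "acetylsalicylic acid": "drug used to reduce pain and inflammation",
    "asthma": "chronic disease of the airways causing wheezing",
}


def generate_definition(label):
    """Hypothetical placeholder for an LLM call that defines a concept."""
    return TOY_DEFINITIONS.get(label.lower(), "")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Bag-of-words counts; an assumed stand-in for an embedding model."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def enrich(label):
    """Concatenate a concept label with its generated definition."""
    return f"{label} {generate_definition(label)}".strip()


def retrieve_candidates(source_label, target_labels, k=2):
    """Return the k target labels most similar to the enriched source."""
    src = embed(enrich(source_label))
    scored = [(cosine(src, embed(enrich(t))), t) for t in target_labels]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(label, score) for score, label in scored[:k]]


def align(source_labels, target_labels, k=2):
    """Map each source label to one target label, or None if nothing matches.

    Assumption: an exact match on normalized labels takes priority;
    otherwise the top retrieved candidate with nonzero similarity is used.
    """
    target_by_norm = {" ".join(tokenize(t)): t for t in target_labels}
    alignment = {}
    for s in source_labels:
        exact = target_by_norm.get(" ".join(tokenize(s)))
        if exact is not None:
            alignment[s] = exact
            continue
        candidates = retrieve_candidates(s, target_labels, k)
        top_label, top_score = candidates[0] if candidates else (None, 0.0)
        alignment[s] = top_label if top_score > 0 else None
    return alignment
```

Calling `align(["Myocardial infarction", "asthma"], ["heart attack", "Asthma"])` resolves the second pair through the exact-match path and the first through the enriched-text similarity path. In a realistic setting the definition generator and the embedding function would each be replaced by a model call, and a score threshold would likely replace the nonzero check.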