Ontology matching (OM) is essential for semantic interoperability and integration across heterogeneous knowledge sources, particularly in the biomedical domain, which contains numerous complex concepts related to diseases and pharmaceuticals. This paper introduces GenOM, a large language model (LLM)-based ontology alignment framework that enriches the semantic representations of ontology concepts by generating textual definitions, retrieves alignment candidates with an embedding model, and incorporates exact-matching-based tools to improve precision. Extensive experiments on the OAEI Bio-ML track show that GenOM often achieves competitive performance, surpassing many baselines, including traditional OM systems and recent LLM-based methods. Ablation studies further confirm the effectiveness of semantic enrichment and few-shot prompting, highlighting the framework's robustness and adaptability.
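The three stages named in the abstract (definition enrichment, embedding-based candidate retrieval, and exact matching) can be illustrated with a minimal Python sketch. This is not GenOM's implementation; everything in it is an assumption made for illustration. The definitions are hand-written placeholders for LLM-generated text, `embed` is a toy bag-of-words stand-in for a real embedding model, and `normalize`, `align`, and the concept identifiers are hypothetical names.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag of lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize(label: str) -> str:
    """Lowercase a label and collapse hyphens and whitespace for exact matching."""
    return " ".join(label.lower().replace("-", " ").split())


def align(source: dict, target: dict, top_k: int = 1) -> dict:
    """Map each source concept id to a ranked list of (target id, score).

    Both inputs map concept id -> (label, definition). An exact label match
    takes priority; otherwise candidates are ranked by the similarity of the
    label enriched with its definition.
    """
    target_vecs = {tid: embed(f"{lbl} {dfn}") for tid, (lbl, dfn) in target.items()}
    target_by_label = {normalize(lbl): tid for tid, (lbl, _) in target.items()}

    result = {}
    for sid, (lbl, dfn) in source.items():
        exact = target_by_label.get(normalize(lbl))
        if exact is not None:
            # Exact-match step: accept the identical label with full score.
            result[sid] = [(exact, 1.0)]
            continue
        # Retrieval step: compare enriched text against every target concept.
        vec = embed(f"{lbl} {dfn}")
        ranked = sorted(
            ((tid, cosine(vec, tv)) for tid, tv in target_vecs.items()),
            key=lambda pair: pair[1],
            reverse=True,
        )
        result[sid] = ranked[:top_k]
    return result


# Placeholder definitions; in the framework these would be generated by an LLM.
source = {
    "s1": ("Myocardial infarction",
           "death of heart muscle tissue caused by blocked blood supply"),
    "s2": ("Aspirin", "a medication that relieves pain and inflammation"),
}
target = {
    "t1": ("Heart attack",
           "damage to heart muscle caused by blocked blood supply to the heart"),
    "t2": ("aspirin", "a drug used to reduce pain and fever"),
    "t3": ("Diabetes mellitus", "a metabolic disease with high blood sugar"),
}

alignments = align(source, target)
for sid, candidates in alignments.items():
    print(sid, candidates)
```

In this toy example the labels "Myocardial infarction" and "Heart attack" share no words, so the candidate is retrieved only because the definitions overlap, which is the intuition behind enriching concepts before retrieval. A real system would replace `embed` with a dense embedding model and would likely pass retrieved candidates to a further judgment step.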