Uncertainty in Automated Ontology Matching: Lessons Learned from an Empirical Experimentation

Data integration is a classic research field and a pressing need within the information science community. Ontologies play a critical role in this process by providing well-consolidated support for linking and semantically integrating datasets through interoperability. This paper approaches data integration from an application perspective, looking at techniques based on ontology matching. An ontology-based process can be considered adequate only if the different sources of information are matched manually. However, since manual matching becomes unrealistic once the system scales up, automating the matching process becomes a compelling need. We have therefore conducted experiments on actual data using existing automatic ontology matching tools from the scientific community. Even for a relatively simple case study (the spatio-temporal alignment of global indicators), the outcomes clearly show significant uncertainty resulting from errors and inaccuracies along the automated matching process. More concretely, this paper aims to test a bottom-up knowledge-building approach on real-world data, discuss the lessons learned from the experimental results of the case study, and draw conclusions about uncertainty and uncertainty management in an automated ontology matching process. While the most common evaluation metrics clearly demonstrate the unreliability of fully automated matching solutions, properly designed semi-supervised approaches seem mature enough for more general application.

