Schema matching is a core data integration task that focuses on identifying correspondences among attributes of multiple schemata. Numerous algorithmic approaches have been suggested for schema matching over the years, aiming to solve the task with as little human involvement as possible. Yet humans are still required in the loop, both to validate algorithms and to produce ground truth data for algorithms to be trained against. In recent years, a new research direction has investigated the capabilities and behavior of humans while they perform matching tasks. Previous works used this knowledge to predict, and even improve, the performance of human matchers. In this work, we continue this line of research by suggesting a novel measure to evaluate the performance of human matchers, based on calibration, a common metacognition measure. The proposed measure enables detailed analysis of various factors in the behavior of human matchers and their relation to human performance. Such analysis can be further used to develop heuristics and methods to better assess and improve annotation quality.