Information Extraction in handwritten documents typically relies on obtaining an automatic transcription and then performing Named Entity Recognition (NER) over that transcription. For this reason, system performance on publicly available datasets is usually evaluated with metrics particular to each dataset. Moreover, most of these metrics are sensitive to reading order errors. As a result, they do not reflect the expected final application of the system and introduce biases on more complex documents. In this paper, we propose and publicly release a set of reading-order-independent metrics tailored to evaluating Information Extraction in handwritten documents. In our experiments, we analyze the behavior of these metrics in depth and recommend what we consider to be the minimal set of metrics needed to evaluate a task correctly.