Document Key Information Extraction (KIE) is a technology that transforms valuable information in document images into structured data, and it has become an essential function in industrial settings. However, current evaluation metrics for this technology do not accurately reflect the critical attributes of its industrial applications. In this paper, we present KIEval, a novel application-centric evaluation metric for Document KIE models. Unlike prior metrics, KIEval assesses Document KIE models not only on the extraction of individual pieces of information (entities) but also on the extraction of structured information (grouping). Evaluating structured information yields an assessment of Document KIE models that better reflects how grouped information is extracted from documents in industrial settings. Because KIEval is designed with industrial application in mind, we believe it can become a standard evaluation metric for developing or applying Document KIE models in practice. The code will be made publicly available.
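To make the distinction between entity-level and group-level evaluation concrete, the following is a minimal sketch under stated assumptions, not the KIEval definition, which this abstract does not give. The function names `entity_f1` and `group_f1`, the `(key, value)` entity representation, and the exact-match criterion for groups are all hypothetical choices made for illustration.

```python
from collections import Counter


def f1(tp, n_pred, n_gold):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


def entity_f1(pred_groups, gold_groups):
    """Score entities in isolation: grouping is ignored (assumed scheme)."""
    pred = Counter(e for g in pred_groups for e in g)
    gold = Counter(e for g in gold_groups for e in g)
    tp = sum((pred & gold).values())  # multiset intersection
    return f1(tp, sum(pred.values()), sum(gold.values()))


def group_f1(pred_groups, gold_groups):
    """Score whole groups: a group counts only if every entity matches
    (an assumed exact-match criterion, chosen for simplicity)."""
    pred = Counter(tuple(sorted(g)) for g in pred_groups)
    gold = Counter(tuple(sorted(g)) for g in gold_groups)
    tp = sum((pred & gold).values())
    return f1(tp, sum(pred.values()), sum(gold.values()))


# Hypothetical receipt with two line items; the prediction swaps the prices.
gold = [
    [("item", "Latte"), ("price", "4.50")],
    [("item", "Bagel"), ("price", "3.00")],
]
pred = [
    [("item", "Latte"), ("price", "3.00")],
    [("item", "Bagel"), ("price", "4.50")],
]

print(entity_f1(pred, gold))  # every entity was found somewhere
print(group_f1(pred, gold))   # but no line item was reconstructed correctly
```

In this toy case, every entity is extracted, yet no line item is usable as structured data, which is the kind of gap an entity-only metric cannot expose.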