Document Key Information Extraction (KIE) is a technology that transforms valuable information in document images into structured data, and it has become an essential function in industrial settings. However, current evaluation metrics for this technology do not accurately reflect the critical attributes of its industrial applications. In this paper, we present KIEval, a novel application-centric evaluation metric for Document KIE models. Unlike prior metrics, KIEval assesses Document KIE models not only on the extraction of individual information (entity) but also on the extraction of structured information (grouping). Evaluating structured information yields an assessment of Document KIE models that better reflects the task of extracting grouped information from documents in industrial settings. Because KIEval is designed with industrial applications in mind, we believe that it can become a standard evaluation metric for developing or applying Document KIE models in practice. The code will be publicly available.
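To make the distinction between entity-level and group-level evaluation concrete, the following is a minimal illustrative sketch. It is not KIEval's actual definition, which the abstract does not specify. The function names (`f1`, `entity_f1`, `group_f1`), the receipt-like toy data, and the rule that a group counts only when all of its entities match exactly are assumptions made purely for illustration.

```python
from collections import Counter

def f1(pred: Counter, gold: Counter) -> float:
    """F1 score between two multisets of hashable items."""
    matched = sum((pred & gold).values())  # multiset intersection
    if matched == 0:
        return 0.0
    precision = matched / sum(pred.values())
    recall = matched / sum(gold.values())
    return 2 * precision * recall / (precision + recall)

def entity_f1(pred_groups, gold_groups) -> float:
    # Hypothetical entity-level score: flatten every group into
    # (field, value) pairs and ignore which group each pair came from.
    pred = Counter(e for g in pred_groups for e in g)
    gold = Counter(e for g in gold_groups for e in g)
    return f1(pred, gold)

def group_f1(pred_groups, gold_groups) -> float:
    # Hypothetical group-level score (an assumption, not KIEval itself):
    # a predicted group is correct only if its full set of entities
    # exactly equals a gold group.
    pred = Counter(frozenset(g) for g in pred_groups)
    gold = Counter(frozenset(g) for g in gold_groups)
    return f1(pred, gold)

# Toy data: two line items, each a group of (field, value) entities.
gold = [
    [("item", "coffee"), ("price", "3.00")],
    [("item", "tea"), ("price", "2.50")],
]
# Every entity is extracted correctly, but the prices are attached
# to the wrong items.
pred = [
    [("item", "coffee"), ("price", "2.50")],
    [("item", "tea"), ("price", "3.00")],
]

entity_score = entity_f1(pred, gold)
group_score = group_f1(pred, gold)
print(entity_score, group_score)
```

In this toy case the entity-level score is perfect while the group-level score is zero, which illustrates the kind of grouping error that an entity-only metric cannot detect.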