Semantic entity recognition is an important task in visually-rich document understanding. It determines the semantic type of each piece of text by analyzing both the positional relationships between text nodes and the relationships between their contents. Existing document understanding models focus mainly on entity categories and neglect the extraction of entity boundaries. We propose HGA, a novel hypergraph attention framework for document semantic entity recognition, which uses hypergraph attention to attend to entity boundaries and entity categories simultaneously. HGA performs a finer-grained analysis of the document text representations produced by the upstream model and thereby extracts semantic information more effectively. We apply this method on top of GraphLayoutLM to construct a new semantic entity recognition model, HGALayoutLM. Experimental results on FUNSD, CORD, XFUND, and SROIE show that our method effectively improves semantic entity recognition performance over the original model. HGALayoutLM achieves new state-of-the-art results on FUNSD and XFUND.