This paper presents a comprehensive survey of research on form understanding in scanned documents. We review recent advances in the field, highlighting the role of language models and transformers in solving this challenging task. Our methodology involves an in-depth analysis of popular document and form understanding trends over the last decade, which lets us offer insights into the evolution of this domain. Focusing on cutting-edge models, we show how transformers have propelled the field forward and reshaped form understanding techniques. We examine state-of-the-art language models designed to handle the complexities of noisy scanned documents. Furthermore, we present an overview of the latest and most relevant datasets, which serve as essential benchmarks for evaluating the performance of the selected models. By comparing and contrasting the capabilities of these models, we aim to give researchers and practitioners useful guidance in choosing the most suitable solutions for their specific form understanding tasks.