Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parsing table structure, in which detecting table separation lines is crucial. However, wireless and deformed tables make this detection challenging. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines. During the split stage, we introduce a Keypoint Offset Regression (KOR) module, which detects table separation lines by directly regressing the offset of each line relative to its keypoint proposals. In the merge stage, we define a series of merge actions that efficiently describe the table structure based on table grids. Extensive ablation studies demonstrate that the proposed KOR module detects table separation lines quickly and accurately. Furthermore, on public datasets (e.g., WTW, ICDAR-2019 cTDaR Historical, and iFLYTAB), SEMv3 achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/Chunchunwumu/SEMv3.
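To make the idea of regressing a line's offset relative to a keypoint proposal concrete, the following is a minimal sketch and not the SEMv3 implementation. It assumes a roughly horizontal separation line sampled at fixed x positions, with one predicted vertical offset per sample measured from the keypoint's y coordinate. The function name `decode_row_line` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def decode_row_line(keypoint: Point, xs: List[float], offsets: List[float]) -> List[Point]:
    """Decode one roughly horizontal separation line from a keypoint proposal.

    Assumption (not taken from the paper): the line is sampled at the fixed
    x positions in `xs`, and `offsets[i]` is the predicted vertical
    displacement of the line at `xs[i]` relative to the keypoint's y value.
    """
    if len(xs) != len(offsets):
        raise ValueError("xs and offsets must have the same length")
    _, y0 = keypoint
    # Each sampled point is the keypoint's y coordinate shifted by its offset.
    return [(x, y0 + dy) for x, dy in zip(xs, offsets)]


# A slightly tilted line anchored at the keypoint (50, 120).
line = decode_row_line((50.0, 120.0), [0.0, 50.0, 100.0], [-3.0, 0.0, 4.0])
print(line)
```

Under these assumptions, a deformed line is represented as a polyline whose shape is carried entirely by the offsets, so a curved or tilted separator needs no dense per-pixel mask to be decoded.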