Table detection is the task of classifying and localizing table objects within document images. Recent advances in deep learning have brought remarkable success in table detection. However, a significant amount of labeled data is required to train these models effectively. Many semi-supervised approaches have been introduced to reduce the need for large amounts of labeled data. These approaches use CNN-based detectors that rely on anchor proposals and post-processing stages such as non-maximum suppression (NMS). To tackle these limitations, this paper presents a novel end-to-end semi-supervised table detection method that employs the deformable transformer for detecting table objects. We evaluate our semi-supervised method on the PubLayNet, DocBank, ICDAR-19, and TableBank datasets, where it achieves superior performance compared to previous methods. With 10\% of the labels, it outperforms the fully supervised method (deformable transformer) by +3.4 points on the TableBank-both dataset and the previous CNN-based semi-supervised approach (Soft Teacher) by +1.8 points on the PubLayNet dataset. We hope this work opens new possibilities for semi-supervised and unsupervised table detection methods.