Table detection is the task of classifying and localizing table objects within document images. Recent advances in deep learning have brought remarkable success in table detection, but training these models effectively requires a significant amount of labeled data. Many semi-supervised approaches have been introduced to reduce the need for labeled data. However, these approaches use CNN-based detectors that rely on anchor proposals and post-processing stages such as non-maximum suppression (NMS). To address these limitations, this paper presents a novel end-to-end semi-supervised table detection method that employs the deformable transformer for detecting table objects. We evaluate our semi-supervised method on the PubLayNet, DocBank, ICDAR-19, and TableBank datasets, where it achieves superior performance compared to previous methods. It outperforms the fully supervised method (deformable transformer) by +3.4 points with 10% labels on the TableBank-both dataset, and the previous CNN-based semi-supervised approach (Soft Teacher) by +1.8 points with 10% labels on the PubLayNet dataset. We hope this work opens new possibilities for semi-supervised and unsupervised table detection methods.