Linear objects convey substantial information about document structure, but they are challenging to detect accurately because of degradation (curved or erased lines) or decoration (doubled or dashed lines). Many approaches can recover some vector representation, but only one closed-source technique, introduced in 1994 and based on Kalman filters (a particular case of Multiple Object Tracking algorithms), can perform a pixel-accurate instance segmentation of linear objects and enable their selective removal from the original image. We aim to re-popularize this approach and propose: (1) a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking (MOT); (2) document image datasets and metrics which enable both vector- and pixel-based evaluation of linear object detection; (3) a performance comparison of MOT approaches against modern segment detectors; (4) a performance comparison of various tracking strategies, which reveals alternatives to the original Kalman filter approach; and (5) an open-source implementation of a detector which can discriminate instances of curved, erased, dashed, intersecting and/or overlapping linear objects.
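To illustrate the tracking idea behind the Kalman filter approach mentioned above, here is a minimal sketch of a single tracker that follows one roughly horizontal line across the columns of an image. It is not the 1994 technique nor the proposed implementation: the function name `track_line`, the constant-slope state model, the noise parameters and the gating rule are all assumptions chosen for brevity. A full MOT detector would run many such trackers at once and would also record the pixel spans matched to each tracker.

```python
from typing import List, Optional


def track_line(observations: List[Optional[float]],
               process_var: float = 1e-4,
               obs_var: float = 0.25,
               gate_sigmas: float = 3.0) -> List[float]:
    """Follow one line across image columns with a 2-state Kalman filter.

    observations[t] is the vertical centre of the dark span seen in column t,
    or None when nothing is seen (erased or dashed part of the line).
    The state is (position, slope); all parameters are illustrative.
    """
    if not observations or observations[0] is None:
        raise ValueError("the first column must contain an observation")

    pos, slope = float(observations[0]), 0.0
    # Covariance matrix P = [[p00, p01], [p10, p11]], initially the identity.
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0
    estimates = [pos]

    for z in observations[1:]:
        # Predict: move one column to the right, keeping the slope constant.
        pos = pos + slope
        p00, p01, p10, p11 = (p00 + p01 + p10 + p11 + process_var,
                              p01 + p11,
                              p10 + p11,
                              p11 + process_var)

        if z is not None:
            innovation = z - pos
            s = p00 + obs_var  # innovation variance
            # Gate: ignore observations too far from the prediction, e.g. a
            # span that belongs to a crossing line or to a character.
            if innovation * innovation <= gate_sigmas ** 2 * s:
                k0, k1 = p00 / s, p10 / s  # Kalman gain
                pos += k0 * innovation
                slope += k1 * innovation
                p00, p01, p10, p11 = ((1 - k0) * p00, (1 - k0) * p01,
                                      p10 - k1 * p00, p11 - k1 * p01)

        # With no accepted observation, the prediction alone carries the
        # tracker through the gap.
        estimates.append(pos)

    return estimates
```

Because the tracker keeps predicting when a column has no accepted observation, it can bridge erased or dashed segments, and the gate keeps it from jumping onto an intersecting object.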