Deep-learning-based pipelines for semantic segmentation often ignore the structural information available in the annotated images used for training. We propose a novel post-processing module that enforces structural knowledge about the objects of interest to improve the segmentation results provided by deep learning. This module corresponds to a "many-to-one-or-none" inexact graph matching approach and is formulated as a quadratic assignment problem. Our approach is compared to CNN-based segmentation (for various CNN backbones) on two public datasets, one for face segmentation from 2D RGB images (FASSEG) and the other for brain segmentation from 3D MRIs (IBSR). Evaluations are performed using two types of structural information (distances and directional relations; this choice is a hyper-parameter of our generic framework). On FASSEG data, results show that our module improves the accuracy of the CNN by about 6.3% (the Hausdorff distance decreases from 22.11 to 20.71). On IBSR data, the improvement is 51% (the Hausdorff distance decreases from 11.01 to 5.4). In addition, our approach is shown to be resilient to small training datasets, which often limit the performance of deep learning methods: the improvement increases as the size of the training dataset decreases.
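As a quick check of the reported figures, the percentages above are consistent with a relative decrease of the Hausdorff distance. The sketch below recomputes them and also shows one common definition of the symmetric Hausdorff distance between two point sets. The function names `relative_decrease` and `hausdorff` are hypothetical, and the exact variant of the metric used in the evaluation is an assumption, since the abstract does not specify it.

```python
import math


def relative_decrease(before, after):
    """Relative reduction, in percent, when going from `before` to `after`."""
    return 100.0 * (before - after) / before


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets.

    Assumption: plain Euclidean max-min definition; the evaluation
    behind the abstract may use a different variant.
    """
    def directed(p, q):
        # Largest distance from a point of p to its nearest point in q.
        return max(min(math.dist(x, y) for y in q) for x in p)

    return max(directed(a, b), directed(b, a))


# Hausdorff distances quoted in the abstract (CNN alone vs. CNN + module).
fasseg = relative_decrease(22.11, 20.71)
ibsr = relative_decrease(11.01, 5.4)
print(f"FASSEG: {fasseg:.1f}%  IBSR: {ibsr:.0f}%")
```

Since a lower Hausdorff distance means the predicted region boundaries lie closer to the annotated ones, a positive relative decrease corresponds to an improvement.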