Binary semantic segmentation is a fundamental problem in computer vision. As a model-based segmentation method, the graph-cut approach was one of the most successful binary segmentation methods, thanks to the global optimality guarantee of its solutions and its practical polynomial-time complexity. Recently, many deep learning (DL) based methods have been developed for this task and have yielded remarkable performance, resulting in a paradigm shift in the field. To combine the strengths of both approaches, we propose integrating the graph-cut approach into a deep learning network for end-to-end learning. Unfortunately, backward propagation through the graph-cut module in the DL network is challenging because of the combinatorial nature of the graph-cut algorithm. To tackle this challenge, we propose a novel residual graph-cut loss and a quasi-residual connection, which enable the gradients of the residual graph-cut loss to be propagated backward, so that feature learning is guided by the graph-cut segmentation model. In the inference phase, globally optimal segmentation is achieved with respect to the graph-cut energy defined on the optimized image features learned by the DL network. Experiments on the public AZH chronic wound data set and the pancreas cancer data set from the Medical Segmentation Decathlon (MSD) demonstrated promising segmentation accuracy and improved robustness against adversarial attacks.
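To make the global optimality claim concrete, the following is a minimal sketch of classical binary graph-cut segmentation on a toy one-dimensional "image", solved as an s-t minimum cut with a plain Edmonds-Karp max-flow. It does not implement the residual graph-cut loss or the quasi-residual connection, whose details are not given in the abstract. The function names (`min_cut_segment`, `energy`), the integer unary costs, and the smoothness weight are all assumptions chosen for illustration; in the setting described above, the unary terms would instead be derived from features learned by the network.

```python
from collections import deque


def energy(labels, unary_fg, unary_bg, pairwise):
    """Graph-cut energy: per-pixel label costs plus a penalty per label change."""
    unary = sum(unary_fg[i] if l == 1 else unary_bg[i] for i, l in enumerate(labels))
    smooth = sum(pairwise for a, b in zip(labels, labels[1:]) if a != b)
    return unary + smooth


def min_cut_segment(unary_fg, unary_bg, pairwise):
    """Globally optimal binary labeling of a pixel chain via s-t min cut.

    unary_fg[i] / unary_bg[i]: cost of labeling pixel i foreground / background.
    pairwise: cost paid whenever two neighboring pixels get different labels.
    Returns (labels, cut_value); pixels on the source side are foreground (1).
    """
    n = len(unary_fg)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # make sure the reverse residual edge exists

    for i in range(n):
        add_edge(s, i, unary_bg[i])  # cut when pixel i is labeled background
        add_edge(i, t, unary_fg[i])  # cut when pixel i is labeled foreground
    for i in range(n - 1):
        add_edge(i, i + 1, pairwise)
        add_edge(i + 1, i, pairwise)

    cut_value = 0
    while True:
        # Breadth-first search for a shortest augmenting path (Edmonds-Karp).
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        cut_value += bottleneck

    # Nodes still reachable from the source in the residual graph are foreground.
    labels = [1 if i in parent else 0 for i in range(n)]
    return labels, cut_value


# Toy example: the fourth pixel looks like foreground in isolation but is
# surrounded by background evidence, so the smoothness term overrides it.
unary_fg = [1, 2, 8, 3, 9, 9]
unary_bg = [9, 8, 2, 7, 1, 1]
labels, cut_value = min_cut_segment(unary_fg, unary_bg, pairwise=3)
print(labels, cut_value)  # [1, 1, 0, 0, 0, 0] 17
```

With the smoothness weight set to 0 the same call reduces to per-pixel thresholding and returns `[1, 1, 0, 1, 0, 0]` with cut value 10. In both cases the cut value equals `energy(labels, ...)`, which is the sense in which the labeling is globally optimal for the chosen energy.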