For problems in image processing and many other fields, a large class of effective neural networks have encoder-decoder-based architectures. Although these networks have achieved impressive performance, mathematical explanations of their architectures remain underdeveloped. In this paper, we study the encoder-decoder-based network architecture from an algorithmic perspective and provide a mathematical explanation. We use the two-phase Potts model for image segmentation as the example for our explanation. We first associate the segmentation problem with a control problem in the continuous setting. A multigrid method and an operator-splitting scheme, together called PottsMGNet, are then used to discretize the continuous control model. We show that the resulting discrete PottsMGNet is equivalent to an encoder-decoder-based network. We also show that, with minor modifications, a number of popular encoder-decoder-based neural networks are instances of the proposed PottsMGNet. With Soft-Threshold-Dynamics incorporated as a regularizer, PottsMGNet is shown to be robust to network parameters such as width and depth, and it achieves remarkable performance on datasets with very large noise. In nearly all our experiments, the new network performs as well as or better than existing image segmentation networks in accuracy and Dice score.
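The abstract does not spell out how a Soft-Threshold-Dynamics regularizer acts on a segmentation. As a rough illustration only, the sketch below assumes one common two-phase formulation: a concave smoothness term `lam * <u, k*(1-u)>` is linearized at the current iterate, and the entropy-smoothed subproblem is then solved in closed form by a sigmoid. The function names, the 5-point averaging kernel standing in for `k`, and the parameter values are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def smooth(u):
    """Stand-in for the kernel convolution k*u (assumption):
    a 5-point average with periodic boundaries."""
    return (u
            + np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
            + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)) / 5.0


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def soft_threshold_dynamics(o, lam=3.0, eps=0.5, n_iter=20):
    """Regularize a two-phase soft segmentation.

    o   : 2D array of scores, positive values favor the foreground
    lam : weight of the smoothness term (hypothetical default)
    eps : entropy weight controlling how soft the threshold is
    """
    u = sigmoid(o / eps)  # unregularized soft segmentation
    for _ in range(n_iter):
        # Gradient of lam * <u, k*(1-u)> at the current iterate
        # (valid because the averaging kernel is symmetric).
        g = lam * smooth(1.0 - 2.0 * u)
        # Closed-form minimizer of the linearized, entropy-smoothed energy.
        u = sigmoid((o - g) / eps)
    return u


# Toy scores: a square foreground region with two mislabeled pixels.
o = -np.ones((16, 16))
o[4:12, 4:12] = 1.0
o[8, 8] = -1.0   # spurious hole inside the square
o[1, 1] = 1.0    # spurious dot outside the square

u = soft_threshold_dynamics(o)
mask = u > 0.5   # hard segmentation after regularization
```

In this toy setting, thresholding the raw scores would keep both mislabeled pixels, while the iteration pulls each one toward the label of its neighborhood. In a network, a step of this kind would presumably be applied to the features entering the activation; how PottsMGNet does this is described in the paper itself and is not reproduced here.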