For problems in image processing and many other fields, a large class of effective neural networks has encoder-decoder-based architectures. Although these networks have achieved impressive performance, mathematical explanations of their architectures remain underdeveloped. In this paper, we study the encoder-decoder-based network architecture from an algorithmic perspective and provide a mathematical explanation. We use the two-phase Potts model for image segmentation as the example for our explanations. We associate the segmentation problem with a control problem in the continuous setting. A multigrid method and an operator-splitting scheme, together called PottsMGNet, are then used to discretize the continuous control model. We show that the resulting discrete PottsMGNet is equivalent to an encoder-decoder-based network. With minor modifications, a number of popular encoder-decoder-based neural networks are shown to be instances of the proposed PottsMGNet. With Soft-Threshold-Dynamics incorporated as a regularizer, PottsMGNet is shown to be robust to network parameters such as width and depth, and it achieves remarkable performance on datasets with very high noise levels. In nearly all our experiments, the new network performs as well as or better than existing image segmentation networks in accuracy and Dice score.
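As a point of reference for the two evaluation metrics named above, the following is a minimal sketch of pixel accuracy and the Dice score for two-phase (binary) segmentation masks. It is not the authors' evaluation code: the function names, the list-of-rows mask layout, and the convention that two empty masks score a Dice of 1.0 are assumptions made for illustration.

```python
from itertools import chain


def flatten(mask):
    """Flatten a 2D binary mask (a list of rows) into a flat list of 0/1 ints."""
    return [int(bool(v)) for v in chain.from_iterable(mask)]


def pixel_accuracy(pred, target):
    """Fraction of pixels where the predicted label equals the reference label."""
    p, t = flatten(pred), flatten(target)
    if len(p) != len(t) or not p:
        raise ValueError("masks must be non-empty and have the same size")
    return sum(a == b for a, b in zip(p, t)) / len(p)


def dice_score(pred, target):
    """Dice score 2|A ∩ B| / (|A| + |B|) for the foreground phase."""
    p, t = flatten(pred), flatten(target)
    if len(p) != len(t):
        raise ValueError("masks must have the same size")
    intersection = sum(a * b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    if denom == 0:
        # Assumed convention: both masks have no foreground, so they agree fully.
        return 1.0
    return 2.0 * intersection / denom


# Hypothetical 2x3 masks: 1 = foreground phase, 0 = background phase.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

acc = pixel_accuracy(pred, target)
dice = dice_score(pred, target)
```

Accuracy counts agreement on both phases, while the Dice score measures only foreground overlap, which is why the two are usually reported together for segmentation.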