Training deep Convolutional Neural Networks (CNNs) presents unique challenges, including the pervasive issue of elimination singularities: the consistent deactivation of nodes, which produces degenerate manifolds in the loss landscape. These singularities impede efficient learning by disrupting feature propagation. To mitigate this, we introduce Pool Skip, an architectural enhancement that strategically combines a max-pooling layer, a max-unpooling layer, a 3×3 convolution, and a skip connection. This configuration helps stabilize training and preserve feature integrity across layers. We also propose the Weight Inertia hypothesis, which underpins the design of Pool Skip and provides theoretical insight into mitigating the degradation caused by elimination singularities through dimensional and affine compensation. We evaluate our method on a variety of benchmarks spanning both 2D natural and 3D medical imaging, including classification and segmentation tasks. Our findings highlight Pool Skip's effectiveness in enabling more robust CNN training and improving model performance.
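The module composition described above (max pooling, a 3×3 convolution, max unpooling, and an additive skip connection) can be sketched as a small PyTorch module. This is an illustrative reconstruction from the abstract's description, not the authors' implementation; the ordering of the convolution relative to unpooling, the channel counts, and the pooling window are assumptions.

```python
import torch
import torch.nn as nn

class PoolSkip(nn.Module):
    """Illustrative sketch of a Pool Skip block as described in the abstract.

    The operation ordering and hyperparameters (2x2 pooling window,
    channel-preserving 3x3 convolution) are assumptions, not taken
    from the paper's actual implementation.
    """
    def __init__(self, channels: int, pool_size: int = 2):
        super().__init__()
        # return_indices=True lets MaxUnpool2d restore the original locations
        self.pool = nn.MaxPool2d(pool_size, return_indices=True)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.unpool = nn.MaxUnpool2d(pool_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled, indices = self.pool(x)
        transformed = self.conv(pooled)
        # Scatter features back to their pre-pooling spatial positions
        restored = self.unpool(transformed, indices, output_size=x.shape)
        # Skip connection: add the branch output to the original features
        return x + restored
```

Because the branch output has the same shape as its input, the block can be dropped after any convolutional stage without changing the surrounding architecture.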