Convolutional neural networks (CNNs) are a popular choice of model for tasks in computer vision. When CNNs are made with many layers, resulting in a deep neural network, skip connections may be added to create an easier gradient optimization problem while retaining model expressiveness. In this paper, we show that arbitrarily complex, trained, linear CNNs with skip connections can be simplified into a single-layer model, resulting in greatly reduced computational requirements during prediction time. We also present a method for training nonlinear models with skip connections that are gradually removed throughout training, giving the benefits of skip connections without requiring computational overhead during prediction time. These results are demonstrated with practical examples on the Residual Network (ResNet) architecture.
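The collapse of a linear network with skip connections can be sketched in a few lines. Below is a minimal illustration (not the paper's method, and using dense matrices as stand-ins for convolutions, which are themselves linear maps): a two-layer linear model with a skip connection around the first layer folds algebraically into a single matrix, since W2(W1 x + x) = W2(W1 + I) x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for two "trained" linear layers; real linear convolutions
# are also linear operators, so the same algebra applies.
W1 = rng.standard_normal((8, 8))
W2 = rng.standard_normal((8, 8))
x = rng.standard_normal(8)

# Deep form: the skip connection adds the input back after layer 1.
deep = W2 @ (W1 @ x + x)

# Collapsed form: one matrix absorbs both layers and the skip,
# so prediction costs a single matrix-vector product.
W_single = W2 @ (W1 + np.eye(8))
shallow = W_single @ x

print(np.allclose(deep, shallow))  # True
```

The same identity extends layer by layer through an arbitrarily deep linear network, which is why the skip connections add no expressiveness once training is done.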