This paper proposes a new method to improve the training efficiency of deep convolutional neural networks (CNNs). During training, the method computes scores that measure how much each layer's parameters change and whether the layer will continue learning. Based on these scores, the network is scaled down so that fewer parameters need to be learned, which speeds up training. Unlike state-of-the-art methods, which either compress the network for use at inference time or limit the number of operations performed during backpropagation, the proposed method is novel in that it reduces the number of operations performed in forward propagation during training. The proposed training strategy has been validated on two widely used architecture families, VGG and ResNet. Experiments on MNIST, CIFAR-10 and Imagenette show that the proposed method more than halves the training time of the models without significantly impacting accuracy. The reduction in forward-propagation FLOPs during training ranges from 17.83\% for VGG-11 to 83.74\% for ResNet-152. These results demonstrate the effectiveness of the proposed technique in speeding up the training of CNNs. The technique is expected to be especially useful in applications that require fine-tuning or online training of convolutional models, for instance because data arrive sequentially.
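To make the idea of a per-layer change score concrete, the following is a minimal sketch and not the scoring rule used in the paper, which the abstract does not specify. It assumes, purely for illustration, that a layer's score is the relative L2 change of its parameters between two training snapshots, and that a layer is treated as no longer learning when that score falls below a threshold. The names `relative_change`, `layers_still_learning` and `threshold`, as well as the toy parameter values, are hypothetical.

```python
import math


def relative_change(prev, curr, eps=1e-12):
    """Relative L2 change between two flat parameter lists of equal length.

    Assumed score for illustration only; the paper's score may differ.
    """
    if len(prev) != len(curr):
        raise ValueError("parameter snapshots must have the same length")
    diff = math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))
    norm = math.sqrt(sum(p ** 2 for p in prev))
    return diff / (norm + eps)


def layers_still_learning(prev_params, curr_params, threshold=1e-2):
    """Map each layer name to True if its relative change is >= threshold."""
    return {
        name: relative_change(prev_params[name], curr_params[name]) >= threshold
        for name in prev_params
    }


# Toy snapshots of two layers at consecutive epochs (hypothetical values).
prev = {"conv1": [1.0, -2.0, 0.5], "conv2": [0.3, 0.4, -0.1]}
curr = {"conv1": [1.0, -2.0, 0.5001], "conv2": [0.5, 0.1, -0.3]}

# conv1 barely moved, so it is flagged as no longer learning; conv2 still is.
status = layers_still_learning(prev, curr)
```

In a scheme of this kind, layers flagged as no longer learning would be the candidates for scaling down, so that later forward passes involve fewer operations. How the scaling is actually performed is not described in the abstract and is left out of the sketch.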