Deep neural networks (DNNs) have achieved remarkable performance across a wide range of tasks. However, this success often comes at the cost of unnecessarily large model sizes, high computational demands, and substantial memory footprints. Powerful architectures are typically trained at their full depth, yet not all datasets or tasks require such high model capacity. Training very deep architectures on relatively low-complexity datasets frequently leads to wasted computation, unnecessary energy consumption, and excessive memory usage, which in turn makes deploying these models on resource-constrained devices impractical. To address this problem, we introduce Optimally Deep Networks (ODNs), which strike a balance between model depth and task complexity. Specifically, we propose a NAS-like training strategy called progressive depth expansion, which begins by training deep networks at shallow depths and incrementally increases their depth as the earlier blocks converge, continuing this process until the target accuracy is reached. ODNs retain only the optimal depth for a given dataset, removing redundant layers. This cuts future training and inference costs, lowers the memory footprint, improves computational efficiency, and facilitates deployment on edge devices. Empirical results show that the optimal depths of ResNet-18 and ResNet-34 for MNIST and SVHN achieve up to 98.64% and 96.44% reductions in memory footprint, while maintaining competitive accuracies of 99.31% and 96.08%, respectively.
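The sketch below illustrates the progressive depth expansion idea in a PyTorch-style setting; it is a minimal illustration, not the authors' implementation. It assumes the network's blocks preserve the feature dimensionality (so the same head can be attached at any active depth), and the names `ProgressiveDepthNet`, `train_fn`, `evaluate_fn`, and `target_accuracy` are hypothetical placeholders.

```python
# Minimal sketch of progressive depth expansion (assumption: all blocks
# preserve feature dimensions, so the head works at any active depth).
import torch.nn as nn


class ProgressiveDepthNet(nn.Module):
    def __init__(self, stem, blocks, head):
        super().__init__()
        self.stem = stem                      # initial convolutional layers
        self.blocks = nn.ModuleList(blocks)   # full-depth pool of blocks
        self.head = head                      # classifier head
        self.active_depth = 1                 # start training at shallow depth

    def forward(self, x):
        x = self.stem(x)
        for block in self.blocks[: self.active_depth]:
            x = block(x)                      # only active blocks are used
        return self.head(x)

    def grow(self):
        """Activate one more block once the current depth has converged."""
        if self.active_depth < len(self.blocks):
            self.active_depth += 1


def train_progressively(model, train_fn, evaluate_fn, target_accuracy):
    """Train at increasing depths until the target accuracy is reached."""
    while True:
        train_fn(model)                       # train blocks up to active_depth
        acc = evaluate_fn(model)
        if acc >= target_accuracy or model.active_depth == len(model.blocks):
            # Optimal depth found (or full depth reached); redundant
            # inactive blocks can be discarded at this point.
            return model.active_depth, acc
        model.grow()                          # expand depth and continue
```

Under these assumptions, the returned `active_depth` corresponds to the optimal depth retained by an ODN, and the unused deeper blocks can be dropped to realize the memory and compute savings described above.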