Existing model compression methods based on structured pruning typically require complicated multi-stage procedures. Each stage demands substantial engineering effort and domain knowledge from end users, which prevents wider application to broader scenarios. We propose the second generation of Only-Train-Once (OTOv2), which, for the first time, automatically trains and compresses a general DNN only once from scratch to produce a more compact model with competitive performance without fine-tuning. OTOv2 is automatic and pluggable into various deep learning applications, and requires minimal engineering effort from users. Methodologically, OTOv2 proposes two major improvements: (i) Autonomy: it automatically exploits the dependency of general DNNs, partitions the trainable variables into Zero-Invariant Groups (ZIGs), and constructs the compressed model; and (ii) Dual Half-Space Projected Gradient (DHSPG): a novel optimizer that solves structured-sparsity problems more reliably. Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and StackedUnets, the majority of which cannot be handled by other methods without extensive handcrafting. On benchmark datasets including CIFAR10/100, DIV2K, Fashion-MNIST, SVHN and ImageNet, OTOv2 performs competitively with, or even better than, the state of the art. The source code is available at https://github.com/tianyic/only_train_once.
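To make the idea behind the Zero-Invariant Groups in item (i) concrete, here is a minimal sketch in plain Python. It is not the OTOv2 implementation and does not use the `only_train_once` API. It assumes the common reading of a ZIG as a set of parameters that, when all zero, force the corresponding output to zero. The toy two-layer network and the helpers `zero_group` and `prune_group` are hypothetical names introduced only for illustration.

```python
# Illustrative sketch only; NOT the OTOv2 implementation or its API.
# Toy network: y = W2 @ relu(W1 @ x + b1), written with plain lists.

def relu(v):
    return [max(0.0, a) for a in v]

def affine(W, b, x):
    # One output per row of W: dot(row, x) + bias.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(W1, b1, W2, x):
    h = relu(affine(W1, b1, x))
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

def zero_group(W1, b1, i):
    """Zero hypothetical group i: row i of W1 together with b1[i]."""
    W1z = [[0.0] * len(row) if k == i else list(row) for k, row in enumerate(W1)]
    b1z = [0.0 if k == i else b for k, b in enumerate(b1)]
    return W1z, b1z

def prune_group(W1, b1, W2, i):
    """Physically remove hidden unit i: row i of W1, b1[i], and column i of W2."""
    W1p = [row for k, row in enumerate(W1) if k != i]
    b1p = [b for k, b in enumerate(b1) if k != i]
    W2p = [[w for k, w in enumerate(row) if k != i] for row in W2]
    return W1p, b1p, W2p

# Arbitrary example weights (assumed values, chosen only for the demo).
W1 = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]
b1 = [0.1, -0.2, 0.3]
W2 = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]
x = [1.0, 2.0]

# Once group 1 is zero, hidden unit 1 outputs zero for every input,
# so deleting it leaves the network output unchanged.
W1z, b1z = zero_group(W1, b1, 1)
y_zeroed = forward(W1z, b1z, W2, x)
W1p, b1p, W2p = prune_group(W1z, b1z, W2, 1)
y_pruned = forward(W1p, b1p, W2p, x)
```

In this sketch the smaller network `(W1p, b1p, W2p)` computes the same function as the zeroed one, which is the property that lets a model trained to group sparsity be compressed without fine-tuning. Finding such groups automatically in architectures with skip connections or concatenations is the harder problem the abstract attributes to OTOv2, and it is not attempted here.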