Large-scale pre-trained models have been remarkably successful on downstream tasks. Deploying them on low-capability devices, however, still requires an effective approach such as model pruning. Pruning the model from scratch for every downstream task or device is often impractical given the limited resources available to each. To tackle this issue, we present a scalable one-shot pruning method that leverages the pruned knowledge of similar tasks to extract a sub-network from the pre-trained model for a new task. Specifically, we create a score mask from the pruned models of similar tasks to identify task-specific filters/nodes in the pre-trained model for the new task. Based on this mask, we conduct a single round of pruning to extract a suitably sized sub-network that can quickly adapt to the new task with only a few training iterations. Our experimental analysis demonstrates the effectiveness of the proposed method on convolutional neural networks (CNNs) and vision transformers (ViTs) with various datasets. The proposed method consistently outperforms popular pruning baselines in terms of accuracy and efficiency across diverse downstream tasks with different memory constraints.
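To make the score-mask idea concrete, the following is a minimal sketch of one way such a mask could be built and used for a single round of filter pruning. It is illustrative only: the abstract does not specify the scoring rule, so the similarity-weighted vote over binary keep-masks, the `keep_ratio` budget, and the function names `score_mask` and `one_shot_prune` are all assumptions rather than the method itself.

```python
import numpy as np


def score_mask(task_masks, similarities):
    """Assumed scoring rule: similarity-weighted vote over binary keep-masks.

    task_masks:   (num_tasks, num_filters) array, 1 = filter kept by that
                  similar task's pruned model, 0 = filter pruned.
    similarities: (num_tasks,) non-negative similarity of each task to the
                  new task (hypothetical input; how it is measured is not
                  stated in the abstract).
    """
    masks = np.asarray(task_masks, dtype=float)
    weights = np.asarray(similarities, dtype=float)
    weights = weights / weights.sum()  # normalize so scores lie in [0, 1]
    return weights @ masks             # one score per filter


def one_shot_prune(scores, keep_ratio):
    """Keep the top-scoring filters in a single pass (no iterative pruning)."""
    num_keep = max(1, int(round(keep_ratio * scores.size)))
    # Stable sort on negated scores: highest score first, ties by index.
    keep_idx = np.argsort(-scores, kind="stable")[:num_keep]
    keep = np.zeros(scores.size, dtype=bool)
    keep[keep_idx] = True
    return keep


# Toy layer with 6 filters and pruned models from 3 similar tasks.
task_masks = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1, 1],
]
similarities = [0.5, 0.3, 0.2]

scores = score_mask(task_masks, similarities)
keep = one_shot_prune(scores, keep_ratio=0.5)

# Apply the mask to a stand-in for pre-trained conv weights
# with layout (out_channels, in_channels, kH, kW).
pretrained_weights = np.random.default_rng(0).normal(size=(6, 3, 3, 3))
sub_network_weights = pretrained_weights[keep]
```

In this sketch, the memory constraint of the target device would be expressed through `keep_ratio`; the extracted sub-network would then be fine-tuned on the new task for a few iterations.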