Adapters are a parameter-efficient alternative to fine-tuning: they augment a frozen base network to learn new tasks. However, inference with an adapted model is often slower than with the corresponding fine-tuned model. To address this, we propose Structured Pruning Adapters (SPAs), a family of compressing, task-switching network adapters that accelerate and specialize networks using tiny parameter sets and structured pruning. Specifically, we propose a channel-based SPA and evaluate it with a suite of pruning methods on multiple computer vision benchmarks. Compared to regular structured pruning with fine-tuning, our channel-SPAs improve accuracy by 6.9% on average while using half the parameters when 90% of the weights are pruned. Alternatively, they can learn adaptations with 17x fewer parameters at 70% pruning, at the cost of 1.6% lower accuracy. Similarly, our block-SPA requires far fewer parameters than pruning with fine-tuning. Our experimental code and Python library of adapters are available at github.com/lukashedegaard/structured-pruning-adapters.
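To give a rough sense of why an adapter on a pruned layer can need far fewer learned parameters than fine-tuning the same pruned layer, the following is a minimal counting sketch. It assumes a low-rank adapter form, which the abstract does not specify; the function names, the layer size, the rank, and the choice to prune input and output channels at the same density are all hypothetical and chosen only for illustration. It does not reproduce the reported results.

```python
# Illustrative parameter counting only; not the authors' implementation.
# Assumption: the adapter is a rank-r update (B @ A) to a frozen weight matrix,
# and channel pruning removes whole rows/columns of both the weight and the adapter.

def kept(n, density):
    """Number of channels kept at the given channel density (at least one)."""
    return max(1, round(n * density))

def finetune_params(n_out, n_in, density):
    """Learned weights when the channel-pruned layer is fine-tuned directly."""
    return kept(n_out, density) * kept(n_in, density)

def low_rank_adapter_params(n_out, n_in, density, rank):
    """Learned weights for a rank-`rank` adapter pruned along the same channels."""
    return rank * (kept(n_out, density) + kept(n_in, density))

if __name__ == "__main__":
    n_out, n_in, rank = 512, 512, 8  # hypothetical layer size and adapter rank
    for density in (1.0, 0.3, 0.1):
        ft = finetune_params(n_out, n_in, density)
        ad = low_rank_adapter_params(n_out, n_in, density, rank)
        print(f"density={density:.1f} fine-tune={ft} adapter={ad} ratio={ft / ad:.1f}x")
```

In this toy setting the adapter's parameter advantage shrinks as more channels are pruned, because the fine-tuned weight count falls quadratically with channel density while the low-rank count falls only linearly. The direction of that trend matches the abstract's figures (17x fewer parameters at 70% pruning, half the parameters at 90%), although the numbers printed here are not comparable to the reported ones.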