Pruning-quantization joint learning facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices. However, most existing methods do not jointly learn a global criterion for pruning and quantization in an interpretable way. In this paper, we propose a novel physics-inspired criterion for pruning-quantization joint learning (PIC-PQ), derived from an analogy, which we are the first to draw, between elasticity dynamics (ED) and model compression (MC). Specifically, following Hooke's law in ED, the physics-inspired criterion (PIC) establishes a linear relationship between the filters' importance distribution and the filter property (FP) through a learnable deformation scale. Furthermore, we extend PIC with a relative shift variable to obtain a global view. To ensure feasibility and flexibility, an available maximum bitwidth and a penalty factor are introduced in the quantization bitwidth assignment. Experiments on image classification benchmarks demonstrate that PIC-PQ yields a good trade-off between accuracy and bit-operations (BOPs) compression ratio (e.g., a 54.96× BOPs compression ratio for ResNet56 on CIFAR10 with a 0.10% accuracy drop, and 53.24× for ResNet18 on ImageNet with a 0.61% accuracy drop). The code will be available at https://github.com/fanxxxxyi/PIC-PQ.
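To make the linear criterion concrete, the following minimal sketch scores each filter as a per-layer scale times its filter property plus a per-layer shift, and then ranks all filters globally. It is an illustration under stated assumptions rather than the PIC-PQ implementation: the abstract does not specify the FP, so the L1 norm of the filter weights is assumed here, and the function names, the fixed (not learned) scale and shift values, and the pruning ratio are all hypothetical.

```python
from typing import Dict, List, Tuple


def filter_property(filter_weights: List[float]) -> float:
    """Assumed filter property (FP): the L1 norm of a filter's weights."""
    return sum(abs(w) for w in filter_weights)


def global_importance(
    layers: Dict[str, List[List[float]]],
    scale: Dict[str, float],
    shift: Dict[str, float],
) -> List[Tuple[float, str, int]]:
    """Score every filter as scale[layer] * FP + shift[layer].

    Returns (score, layer name, filter index) tuples sorted from least
    to most important, so filters from all layers share one ranking.
    """
    scores = []
    for name, filters in layers.items():
        for idx, weights in enumerate(filters):
            score = scale[name] * filter_property(weights) + shift[name]
            scores.append((score, name, idx))
    return sorted(scores)


def select_for_pruning(
    scores: List[Tuple[float, str, int]], prune_ratio: float
) -> List[Tuple[str, int]]:
    """Pick the lowest-scoring fraction of filters across all layers."""
    n_prune = int(len(scores) * prune_ratio)
    return [(name, idx) for _, name, idx in scores[:n_prune]]


# Toy example: two layers with two filters each (values are made up).
toy_layers = {
    "conv1": [[0.5, -0.5], [0.1, 0.1]],
    "conv2": [[2.0, 2.0], [1.0, -1.0]],
}
toy_scale = {"conv1": 1.0, "conv2": 0.5}   # stand-in for the deformation scale
toy_shift = {"conv1": 0.0, "conv2": -0.5}  # stand-in for the relative shift

ranking = global_importance(toy_layers, toy_scale, toy_shift)
print(select_for_pruning(ranking, prune_ratio=0.5))
# [('conv1', 1), ('conv2', 1)]
```

In this toy setting the per-layer shift is what makes scores comparable across layers: without it, every filter in `conv2` would outrank every filter in `conv1` simply because its weights are larger in magnitude.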