Deep neural networks deliver remarkable performance and are widely used in visual tasks, but their large size makes them inconvenient to transmit and store. Many previous studies have explored model size compression. These studies, however, typically treat lossy and lossless compression methods in isolation, which makes it difficult to reach high compression ratios efficiently. This work proposes a post-training model size compression method that combines lossy and lossless compression in a unified way. We first propose a unified parametric weight transformation, which allows different lossy compression methods to be applied jointly in a post-training manner. We then introduce a dedicated differentiable counter that guides the optimization of lossy compression toward a point better suited to the subsequent lossless compression. In addition, our method makes it easy to set a desired global compression ratio and adaptively allocates ratios to different layers. Finally, within a short time, our method achieves a stable $10\times$ compression ratio without sacrificing accuracy and a $20\times$ compression ratio with minor accuracy loss. Our code is available at https://github.com/ModelTC/L2_Compression.
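To make the interplay between lossy and lossless compression concrete, the following minimal sketch compares lossless coding of raw float32 weights with lossless coding applied after a lossy step. It is an illustration under stated assumptions, not the proposed method: plain 16-level uniform quantization stands in for the unified weight transformation, `zlib` stands in for the lossless coder, the weights are synthetic Gaussian samples, and the helper names `lossless_size` and `quantize` are hypothetical. The differentiable counter and the per-layer ratio allocation are not modeled.

```python
import random
import struct
import zlib


def lossless_size(payload: bytes) -> int:
    """Size in bytes after lossless coding (zlib is an assumed stand-in)."""
    return len(zlib.compress(payload, 9))


def quantize(weights, num_levels):
    """Lossy step: uniform quantization to integer codes in [0, num_levels - 1]."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (num_levels - 1)
    return [round((w - lo) / step) for w in weights]


# Synthetic stand-in for one layer's weights (assumption: zero-mean Gaussian).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(20000)]

# Baseline: the weights stored as little-endian float32.
raw = struct.pack(f"<{len(weights)}f", *weights)
raw_size = len(raw)

# Lossless compression alone, applied directly to the float32 bytes.
lossless_only_ratio = raw_size / lossless_size(raw)

# Lossy quantization first, then lossless coding of the integer codes
# (one byte per code before zlib is applied).
codes = quantize(weights, 16)
lossy_then_lossless_ratio = raw_size / lossless_size(bytes(codes))

print(f"lossless only:       {lossless_only_ratio:.2f}x")
print(f"lossy then lossless: {lossy_then_lossless_ratio:.2f}x")
```

In this toy setting the quantized codes are concentrated on a few values, so the lossless coder has far more redundancy to exploit than it finds in the raw floating-point bytes. This is the kind of dependency between the two stages that motivates optimizing the lossy step with the later lossless step in mind.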