Deep learning models are the most effective models for many machine learning tasks. Their main disadvantage in IoT, mobile devices, independent autonomous systems, or real-time systems is their complexity and memory footprint. Therefore, much research has concentrated on compression techniques for deep learning architectures. One of the most popular techniques is quantization. In most works, quantization is performed with nearest-neighbour rounding. This work focuses on improving quantization efficiency in pretrained, quantized models, an approach with the potential to improve the final accuracy of quantized models. The main postulate of the work is that the final quantization states of a network obtained by nearest-neighbour rounding do not guarantee optimal accuracy. In the presented work, an evolution strategy is used as the optimization approach. In each iteration, the evolution changes the values of a small percentage of weights, shifting them to different quantization states. The work shows that the proposed evolution, with an appropriate set of operators and parameters, can quickly improve the accuracy of quantized models. Results are presented for popular architectures such as VGG and ResNet for image classification and detection. Additionally, simulations were carried out for an autoencoder architecture.
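The idea of shifting a small percentage of weights to different quantization states can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the work itself: a symmetric 4-bit uniform grid, a single linear layer, mean squared output error on random calibration data as the fitness (standing in for model accuracy), a mutation that moves a weight by one quantization level, and a simple (1 + lambda) selection scheme. The function names `quantize_nearest`, `mutate`, and `evolve` are hypothetical.

```python
import numpy as np

def quantize_nearest(w, scale, n_bits=4):
    """Baseline: round each weight to the nearest level of a symmetric uniform grid."""
    qmax = 2 ** (n_bits - 1) - 1
    return np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int64)

def mutate(q, rate, rng, n_bits=4):
    """Shift a small random fraction of weights to an adjacent quantization state."""
    qmax = 2 ** (n_bits - 1) - 1
    child = q.copy()
    mask = rng.random(q.shape) < rate          # which weights to change
    steps = rng.choice([-1, 1], size=q.shape)  # one level down or up (assumed operator)
    child[mask] += steps[mask]
    return np.clip(child, -qmax - 1, qmax)

def evolve(q0, fitness, n_iters=100, n_children=8, rate=0.02, n_bits=4, seed=0):
    """(1 + lambda) evolution: the parent survives unless the best child has lower loss."""
    rng = np.random.default_rng(seed)
    parent, parent_loss = q0, fitness(q0)
    for _ in range(n_iters):
        children = [mutate(parent, rate, rng, n_bits) for _ in range(n_children)]
        losses = [fitness(c) for c in children]
        i = int(np.argmin(losses))
        if losses[i] < parent_loss:
            parent, parent_loss = children[i], losses[i]
    return parent, parent_loss

# Toy setup (assumed): one linear layer and random calibration inputs.
data_rng = np.random.default_rng(42)
w = data_rng.normal(size=(16, 8))
x = data_rng.normal(size=(256, 16))
scale = np.abs(w).max() / 7                    # 4-bit symmetric scale
y_ref = x @ w                                  # full-precision layer output

def output_error(q):
    """Fitness: mean squared error between quantized and full-precision outputs."""
    return float(np.mean((x @ (q * scale) - y_ref) ** 2))

q_nn = quantize_nearest(w, scale)
loss_nn = output_error(q_nn)
q_es, loss_es = evolve(q_nn, output_error)
print(f"nearest rounding: {loss_nn:.6f}  after evolution: {loss_es:.6f}")
```

Because the parent is always retained, the loss after evolution can never exceed that of nearest-neighbour rounding in this sketch; whether and how much it improves depends on the fitness function, mutation rate, and number of iterations chosen.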