The growing size and computational cost of deep neural networks (DNNs) have increased the demand for efficient model deployment methods. Quantization-aware training (QAT) is a representative model compression method that exploits redundancy in weights and activations. However, most existing QAT methods require end-to-end training on the entire dataset, which incurs long training times and high energy costs. Coreset selection, which improves data efficiency by exploiting redundancy in the training data, has also been widely used for efficient training. In this work, we approach the training efficiency of QAT from a new angle: coreset selection. Based on the characteristics of QAT, we propose two metrics, the error vector score and the disagreement score, to quantify the importance of each sample during training. Guided by these two importance metrics, we propose a quantization-aware adaptive coreset selection (ACS) method that selects the data for the current training epoch. We evaluate our method on various networks (ResNet-18, MobileNetV2) and datasets (CIFAR-100, ImageNet-1K) under different quantization settings. Compared with previous coreset selection methods, our method significantly improves QAT performance across different dataset fractions. With only a 10% subset of ImageNet-1K, our method achieves an accuracy of 68.39% for 4-bit quantized ResNet-18, an absolute gain of 4.24% over the baseline.
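The abstract names the two importance metrics but does not define them, so the following is only a minimal sketch under stated assumptions. It assumes the error vector score is the L2 norm of the gap between the quantized model's softmax output and the one-hot label, and that the disagreement score is the L2 norm of the gap between the quantized model's output and that of a full-precision reference model. The mixing weight `alpha`, the function names, and the top-k selection rule are hypothetical choices for illustration and may differ from the actual ACS procedure.

```python
import numpy as np

def error_vector_score(probs_quant, labels, num_classes):
    # Assumed definition: L2 norm of (prediction - one-hot label) per sample.
    onehot = np.eye(num_classes)[labels]
    return np.linalg.norm(probs_quant - onehot, axis=1)

def disagreement_score(probs_quant, probs_fp):
    # Assumed definition: L2 norm of the gap between the quantized model's
    # prediction and a full-precision reference model's prediction.
    return np.linalg.norm(probs_quant - probs_fp, axis=1)

def select_coreset(scores, fraction):
    # Keep the highest-scoring samples; at least one sample is always kept.
    k = max(1, int(round(fraction * len(scores))))
    return np.argsort(-scores, kind="stable")[:k]

# Toy example: 4 samples, 2 classes.
probs_quant = np.array([[0.9, 0.1], [0.4, 0.6], [0.5, 0.5], [0.2, 0.8]])
probs_fp = np.array([[0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8]])
labels = np.array([0, 0, 1, 1])

alpha = 0.5  # hypothetical mixing weight between the two scores
evs = error_vector_score(probs_quant, labels, num_classes=2)
ds = disagreement_score(probs_quant, probs_fp)
importance = alpha * evs + (1.0 - alpha) * ds

# Indices of the samples that would be used in the current epoch.
coreset_idx = select_coreset(importance, fraction=0.5)
```

In an adaptive setting, the scores would be recomputed as training progresses so that the selected subset changes from epoch to epoch; how often to recompute and how to weight the two scores are design choices this sketch leaves open.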