Quantization-aware training (QAT) is a representative model compression method that reduces redundancy in weights and activations. However, most existing QAT methods require end-to-end training on the entire dataset, which suffers from long training time and high energy costs. In addition, potential label noise in the training data undermines the robustness of QAT. Based on an analysis of the loss and gradient of quantized weights, we propose two metrics, the error vector score and the disagreement score, to quantify the importance of each sample during training. Guided by these two metrics, we propose a quantization-aware Adaptive Coreset Selection (ACS) method that selects the data for the current training epoch. We evaluate our method on various networks (ResNet-18, MobileNetV2, RetinaNet), datasets (CIFAR-10, CIFAR-100, ImageNet-1K, COCO), and quantization settings. Specifically, our method achieves an accuracy of 68.39\% with 4-bit quantized ResNet-18 on the ImageNet-1K dataset using only a 10\% subset, an absolute gain of 4.24\% over the baseline. Our method can also improve the robustness of QAT by removing noisy samples from the training set.