Efficient Quantization-aware Training with Adaptive Coreset Selection

The growing size and computational cost of deep neural networks (DNNs) have increased the demand for efficient model deployment methods. Quantization-aware training (QAT) is a representative model compression method that exploits redundancy in weights and activations. However, most existing QAT methods require end-to-end training on the entire dataset, which incurs long training times and high energy costs. Coreset selection, which improves data efficiency by exploiting redundancy in the training data, has also been widely used for efficient training. In this work, we approach the training efficiency of QAT from a new angle: coreset selection. Based on the characteristics of QAT, we propose two metrics, the error vector score and the disagreement score, to quantify the importance of each sample during training. Guided by these two importance metrics, we propose a quantization-aware adaptive coreset selection (ACS) method that selects the data for the current training epoch. We evaluate our method on several networks (ResNet-18, MobileNetV2) and datasets (CIFAR-100, ImageNet-1K) under different quantization settings. Compared with previous coreset selection methods, our method significantly improves QAT performance across different dataset fractions. With only a 10% subset of ImageNet-1K, our method achieves 68.39% accuracy for a 4-bit quantized ResNet-18, an absolute gain of 4.24% over the baseline.
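To make the per-epoch selection idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the two metrics, so the definitions used here are assumptions: the error vector score is taken as the L2 distance between the quantized model's predicted probabilities and the one-hot label, and the disagreement score as the L2 distance between the quantized model's predictions and those of a full-precision reference model. The cosine weighting between the two scores and the names `select_coreset`, `probs_q`, and `probs_fp` are likewise hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_hot(label, num_classes):
    """One-hot encoding of an integer class label."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def select_coreset(probs_q, probs_fp, labels, fraction, epoch, total_epochs):
    """Return indices of the highest-scoring samples for one epoch.

    probs_q  : per-sample class probabilities from the quantized model
    probs_fp : per-sample class probabilities from a full-precision model
    Both score definitions and the cosine schedule are assumptions.
    """
    # Assumed schedule: weight shifts from the error vector score (early)
    # to the disagreement score (late) as training progresses.
    beta = math.cos(0.5 * math.pi * epoch / total_epochs)

    scores = []
    for p_q, p_fp, y in zip(probs_q, probs_fp, labels):
        evs = l2_distance(p_q, one_hot(y, len(p_q)))  # assumed error vector score
        ds = l2_distance(p_q, p_fp)                   # assumed disagreement score
        scores.append(beta * evs + (1.0 - beta) * ds)

    # Keep the requested fraction of the data, at least one sample.
    k = max(1, math.ceil(fraction * len(labels)))
    ranked = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy example: three samples, two classes.
probs_q = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
probs_fp = [[0.9, 0.1], [0.9, 0.1], [0.5, 0.5]]
labels = [0, 0, 1]
subset = select_coreset(probs_q, probs_fp, labels,
                        fraction=0.5, epoch=0, total_epochs=10)
```

In this toy example the second sample is both misclassified by the quantized model and far from the full-precision prediction, so it ranks first. In an actual training loop the scores would be recomputed as the quantized model changes, so the selected subset can differ from epoch to epoch.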

