While vision transformers (ViTs) have shown great potential in computer vision tasks, their high computational and memory costs pose challenges for practical deployment. Existing post-training quantization methods leverage value redistribution or specialized quantizers to address the non-normal activation distributions in ViTs. However, because they overlook the asymmetry of activations and rely on hand-crafted settings, these methods often struggle to maintain performance under low-bit quantization. To overcome these challenges, we introduce SmoothQuant with bias term (SQ-b) to alleviate the asymmetry issue and reduce clamping loss. We also introduce optimal scaling factor ratio search (OPT-m) to automatically determine quantization parameters via a data-dependent mechanism. To further enhance compressibility, we integrate the above techniques into a proposed mixed-precision post-training quantization framework for vision transformers (MPTQ-ViT). We develop greedy mixed-precision quantization (Greedy MP) to allocate layer-wise bit-widths while accounting for both model performance and compressibility. Experiments on ViT, DeiT, and Swin demonstrate significant accuracy improvements over SOTA methods on the ImageNet dataset. Specifically, our proposed methods achieve accuracy improvements ranging from 0.90% to 23.35% on 4-bit ViTs with single-precision and from 3.82% to 78.14% on 5-bit fully quantized ViTs with mixed-precision.
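To make the asymmetry point concrete, below is a minimal sketch of asymmetry-aware smoothing in the spirit of SQ-b. The abstract does not specify the exact formulation, so two elements are assumptions: the per-channel shift is taken as the midpoint of the observed activation range (folded into the layer bias), and the scale follows the standard SmoothQuant rule max|X|^α / max|W|^(1−α). This is an illustrative sketch, not the paper's implementation.

```python
# Minimal sketch of asymmetry-aware smoothing (in the spirit of SQ-b).
# Assumptions (not from the abstract): the shift z is the per-channel
# midpoint of the activation range, and the scale uses the standard
# SmoothQuant rule max|X|^alpha / max|W|^(1-alpha).
import numpy as np

def smooth_with_bias(X, W, b, alpha=0.5, eps=1e-8):
    """Transform (X, W, b) so that X @ W + b is preserved while the
    shifted and rescaled activations are easier to quantize.

    X: (tokens, C_in) calibration activations
    W: (C_in, C_out) weights
    b: (C_out,) bias
    """
    # 1) Per-channel shift removes asymmetry: center each channel on the
    #    midpoint of its range and fold the shift into the bias, since
    #    X @ W + b == (X - z) @ W + (z @ W + b).
    z = (X.max(axis=0) + X.min(axis=0)) / 2.0   # (C_in,)
    b_new = b + z @ W

    # 2) Per-channel scale migrates quantization difficulty from
    #    activations to weights, as in SmoothQuant.
    act_range = np.abs(X - z).max(axis=0)       # range after centering
    w_range = np.abs(W).max(axis=1)
    s = (act_range ** alpha) / (w_range ** (1 - alpha) + eps)
    s = np.maximum(s, eps)

    X_new = (X - z) / s                         # smoothed activations
    W_new = W * s[:, None]                      # rescaled weights
    return X_new, W_new, b_new

# Equivalence check: the transformed layer computes the same output.
rng = np.random.default_rng(0)
X = rng.normal(2.0, 1.0, (128, 64))             # deliberately asymmetric
W = rng.normal(0.0, 0.1, (64, 32))
b = np.zeros(32)
Xn, Wn, bn = smooth_with_bias(X, W, b)
assert np.allclose(X @ W + b, Xn @ Wn + bn, atol=1e-6)
```

Because the shift is absorbed into the bias and the scale cancels between activations and weights, the full-precision output is unchanged; only the quantization-friendliness of the intermediate tensors improves.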
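Likewise, below is a minimal sketch of greedy layer-wise bit allocation in the spirit of Greedy MP. The sensitivity signal (a user-supplied calibration-loss callback named eval_loss here) and the average-bit-width stopping rule are assumptions for illustration; the abstract does not state the actual criterion.

```python
# Minimal sketch of greedy layer-wise bit allocation (in the spirit of
# Greedy MP). Assumptions (not from the abstract): sensitivity is
# measured by a calibration loss `eval_loss(bits_config)`, and the
# stopping rule is a target average bit-width.
def greedy_mixed_precision(layers, eval_loss, start_bits=8, min_bits=4,
                           target_avg_bits=5.0):
    """layers: list of layer names; eval_loss: dict[layer -> bits] -> float."""
    bits = {name: start_bits for name in layers}
    while sum(bits.values()) / len(layers) > target_avg_bits:
        best_layer, best_loss = None, float("inf")
        for name in layers:
            if bits[name] <= min_bits:
                continue
            trial = dict(bits, **{name: bits[name] - 1})
            loss = eval_loss(trial)          # calibration-set loss
            if loss < best_loss:             # pick the least harmful step
                best_layer, best_loss = name, loss
        if best_layer is None:               # every layer is at min_bits
            break
        bits[best_layer] -= 1                # commit the cheapest reduction
    return bits

# Toy usage with a synthetic sensitivity model (purely illustrative):
sens = {"attn1": 0.5, "mlp1": 0.1, "attn2": 0.3, "mlp2": 0.05}
toy_loss = lambda cfg: sum(s * (8 - cfg[n]) ** 2 for n, s in sens.items())
print(greedy_mixed_precision(list(sens), toy_loss))
```

The greedy rule trades compression against accuracy one bit at a time: each iteration lowers the bit-width of the layer whose reduction costs the least calibration loss, which matches the abstract's stated goal of allocating bit-widths with both performance and compressibility in mind.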