Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures. Despite the effectiveness and convenience of PTQ, its reliability under extreme conditions such as distribution shift and data noise remains largely unexplored. This paper first investigates this problem across a range of commonly used PTQ methods. We aim to answer several research questions about how calibration set distribution variations, calibration paradigm selection, and data augmentation or sampling strategies affect PTQ reliability. We conduct a systematic evaluation across a wide range of tasks and commonly used PTQ paradigms. The results show that most existing PTQ methods are not reliable enough in terms of worst-case group performance, highlighting the need for more robust methods. Our findings provide insights for developing PTQ methods that can effectively handle distribution shift scenarios and enable the deployment of quantized DNNs in real-world applications.