This paper examines the quantization methods used in large-scale data analysis models and their hyperparameter choices. The recent surge in data analysis scale has significantly increased computational resource requirements. To address this, quantizing model weights has become a prevalent practice in data analysis applications such as deep learning. Quantization is particularly vital for deploying large models on devices with limited computational resources. However, the selection of quantization hyperparameters, like the number of bits and value range for weight quantization, remains an underexplored area. In this study, we employ the typical case analysis from statistical physics, specifically the replica method, to explore the impact of hyperparameters on the quantization of simple learning models. Our analysis yields three key findings: (i) an unstable hyperparameter phase, known as replica symmetry breaking, occurs with a small number of bits and a large quantization width; (ii) there is an optimal quantization width that minimizes error; and (iii) quantization delays the onset of overparameterization, helping to mitigate overfitting as indicated by the double descent phenomenon. We also discover that non-uniform quantization can enhance stability. Additionally, we develop an approximate message-passing algorithm to validate our theoretical results.
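To make the two hyperparameters named above concrete, the following is a minimal sketch of a uniform weight quantizer. It is not the paper's replica analysis or its message-passing algorithm. It assumes, purely for illustration, that the "quantization width" is the half-range `width` of a symmetric grid of `2**n_bits` evenly spaced levels on `[-width, width]`, and that the weights are standard Gaussian. The names `quantize_uniform` and `quantization_mse` are hypothetical and do not come from the paper.

```python
import numpy as np


def quantize_uniform(weights, n_bits, width):
    """Map each weight to the nearest of 2**n_bits evenly spaced levels.

    Assumption (not from the paper): the levels form a symmetric grid on
    [-width, width], and weights outside that range are clipped to it.
    """
    n_levels = 2 ** n_bits
    step = 2.0 * width / (n_levels - 1)      # spacing between adjacent levels
    clipped = np.clip(weights, -width, width)
    indices = np.round((clipped + width) / step)  # index of the nearest level
    return indices * step - width


def quantization_mse(weights, n_bits, width):
    """Mean squared error between the weights and their quantized values."""
    return float(np.mean((weights - quantize_uniform(weights, n_bits, width)) ** 2))


# Illustrative data only: standard Gaussian "weights", not a trained model.
rng = np.random.default_rng(0)
weights = rng.standard_normal(10_000)

# Sweep the width at a fixed, small number of bits.
for width in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"n_bits=2  width={width:4.1f}  mse={quantization_mse(weights, 2, width):.4f}")
```

In this toy setting, a width that is too small clips most of the weights, while a width that is too large leaves the few available levels far apart, so the reconstruction error is lowest at an intermediate width. This is only an empirical analogue of finding (ii), which the paper establishes for learning models rather than for fixed Gaussian weights.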