Quantization is a widely used compression method that effectively reduces redundancies in over-parameterized neural networks. However, existing quantization techniques for deep neural networks often lack a comprehensive error analysis due to the presence of non-convex loss functions and nonlinear activations. In this paper, we propose a fast stochastic algorithm for quantizing the weights of fully trained neural networks. Our approach combines a greedy path-following mechanism with a stochastic quantizer. Its computational complexity scales only linearly with the number of weights in the network, thereby enabling the efficient quantization of large networks. Importantly, we establish, for the first time, full-network error bounds under an infinite alphabet condition and minimal assumptions on the weights and input data. As an application of this result, we prove that when quantizing a multi-layer network with Gaussian weights, the relative square quantization error decays linearly as the degree of over-parameterization increases. Furthermore, we demonstrate that error bounds equivalent to those obtained in the infinite alphabet case can be achieved using only on the order of $\log\log N$ bits per weight, where $N$ is the largest number of neurons in a layer.
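To make the combination of greedy path following and stochastic rounding concrete, the following is a minimal sketch for a single neuron. The abstract does not state the update rule, so the form used here is an assumption: weights are processed one at a time, each is shifted to cancel as much of the accumulated pre-activation error as possible, and the result is rounded at random to a neighbouring point of the infinite grid $\delta\mathbb{Z}$ so that the rounding is unbiased. The names `stochastic_round`, `quantize_neuron`, `delta`, `w`, and `X` are hypothetical, and the sketch does not cover the multi-layer procedure or the finite-alphabet case.

```python
import math
import random


def stochastic_round(z, delta, rng):
    """Round z to a neighbouring point of the grid delta*Z, unbiased in expectation."""
    lower = math.floor(z / delta) * delta
    p_up = (z - lower) / delta  # probability of rounding up to lower + delta
    return lower + delta if rng.random() < p_up else lower


def quantize_neuron(w, X, delta, rng):
    """Greedy path-following quantization of one neuron (assumed form).

    w: list of N weights.
    X: list of N feature columns, each a list of m samples.
    Returns (q, u): q lies on delta*Z, and u = sum_t (w_t - q_t) * X_t
    is the final error in the neuron's pre-activation on the m samples.
    """
    m = len(X[0])
    u = [0.0] * m  # accumulated pre-activation error so far
    q = []
    for w_t, x_t in zip(w, X):
        norm_sq = sum(v * v for v in x_t)
        if norm_sq == 0.0:
            z = w_t  # feature is identically zero; no error to correct
        else:
            # Shift w_t by the projection of the running error onto x_t.
            z = w_t + sum(a * b for a, b in zip(x_t, u)) / norm_sq
        q_t = stochastic_round(z, delta, rng)
        # Update the running error with the newly quantized weight.
        u = [u_i + (w_t - q_t) * x_i for u_i, x_i in zip(u, x_t)]
        q.append(q_t)
    return q, u


# Example: quantize a neuron with Gaussian weights on random data.
rng = random.Random(0)
N, m, delta = 256, 16, 0.1
w = [rng.gauss(0.0, 1.0) for _ in range(N)]
X = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(N)]
q, u = quantize_neuron(w, X, delta, rng)
```

In this sketch each weight costs $O(m)$ operations for $m$ data samples, so the total work grows linearly with the number of weights, in line with the scaling described above.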