Modern deep neural networks (DNNs) are extremely powerful; however, this power comes at the price of increased depth and more parameters per layer, which makes training and inference more computationally demanding. To address this key limitation, considerable effort has been devoted to the compression (e.g., sparsification and/or quantization) of these large-scale machine learning models, so that they can be deployed on low-power IoT devices. In this paper, building upon recent advances in the neural tangent kernel (NTK) and random matrix theory (RMT), we propose a novel compression approach for wide, fully-connected \emph{deep} neural nets. Specifically, we demonstrate that in the high-dimensional regime where the number of data points $n$ and their dimension $p$ are both large, and under a Gaussian mixture model for the data, there exists an \emph{asymptotic spectral equivalence} between the NTK matrices of a large family of DNN models. This theoretical result enables ``lossless'' compression of a given DNN, in the sense that the compressed network yields asymptotically the same NTK as the original (dense and unquantized) network, with its weights and activations taking values \emph{only} in $\{ 0, \pm 1 \}$ up to a scaling. Experiments on both synthetic and real-world data support the advantages of the proposed compression scheme, with code available at \url{https://github.com/Model-Compression/Lossless_Compression}.
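To make the phrase "values only in $\{ 0, \pm 1 \}$ up to a scaling" concrete, the following is a minimal NumPy sketch of one ternary layer. It is an illustration under stated assumptions, not the scheme from the repository above: the function names, the sparsity level, the threshold, and the unit-variance scaling are all hypothetical choices made here, and the sketch does not perform the NTK-matching step on which the lossless guarantee rests.

```python
import numpy as np


def sample_ternary_weights(shape, sparsity, rng):
    """Draw i.i.d. entries from {0, +s, -s} with P(entry == 0) = sparsity.

    Assumption for this sketch: the scale s = 1 / sqrt(1 - sparsity) is
    chosen so that every entry has zero mean and unit variance.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must lie in [0, 1)")
    signs = rng.choice([-1.0, 1.0], size=shape)   # random +/-1 signs
    keep = rng.random(shape) >= sparsity          # kept with prob. 1 - sparsity
    return signs * keep / np.sqrt(1.0 - sparsity)


def ternary_activation(x, threshold, scale=1.0):
    """Map x to scale * {-1, 0, +1}; inputs with |x| <= threshold go to 0."""
    return scale * np.sign(x) * (np.abs(x) > threshold)


rng = np.random.default_rng(0)
p, width = 256, 512                               # hypothetical layer sizes
W = sample_ternary_weights((width, p), sparsity=0.9, rng=rng)
x = rng.standard_normal(p)                        # one synthetic input vector
h = ternary_activation(W @ x / np.sqrt(p), threshold=0.5)
```

In this sketch roughly 90% of the weights are exactly zero and the rest share a single magnitude, so the layer can be stored as a sparse sign pattern plus one scalar. Choosing the sparsity level, threshold, and scale so that the compressed network's NTK matches that of the original network is the part the paper's theory addresses, and it is deliberately left out here.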