Tensor decompositions have been successfully applied to compress neural networks. Compression algorithms based on tensor decompositions commonly minimize the approximation error on the weights. Recent work assumes that the approximation error on the weights is a proxy for the performance of the model when compressing multiple layers and fine-tuning the compressed model. Surprisingly, little research has systematically evaluated which approximation errors can be used to make choices regarding the layer, tensor decomposition method, and level of compression. To close this gap, we perform an experimental study to test whether this assumption holds across different layers and types of decompositions, and to determine the effect of fine-tuning. We include the approximation error on the features resulting from a compressed layer in our analysis to test whether it provides a better proxy, since it explicitly takes the data into account. We find that the approximation error on the weights is positively correlated with the performance error, both before and after fine-tuning. Basing the approximation error on the features does not improve the correlation significantly. Although scaling the approximation error is commonly used to account for the different sizes of layers, the average correlation across layers is smaller than the correlation across all choices (i.e., layers, decompositions, and levels of compression) before fine-tuning. When the correlation is calculated across the different decompositions, the average rank correlation is larger than across all choices. This means that multiple decompositions can be considered for compression and that the approximation error can be used to choose between them.
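The two approximation errors discussed above, and a rank correlation between two sets of measurements, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a truncated SVD of a single weight matrix stands in for the tensor decompositions, the weights and inputs are random, and the helper names (`truncated_svd`, `relative_error`, `spearman`) are hypothetical. It correlates the weight error with the feature error only to show the mechanics; in the study each error would instead be compared against the performance error of the compressed model, which requires a trained network and a dataset.

```python
import numpy as np

def truncated_svd(weight, rank):
    """Best rank-`rank` approximation of a 2-D matrix (stand-in for a tensor decomposition)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def relative_error(reference, approximation):
    """Frobenius-norm error scaled by the norm of the reference."""
    return np.linalg.norm(reference - approximation) / np.linalg.norm(reference)

def spearman(x, y):
    """Spearman rank correlation; no tie handling, so values must be distinct."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 128))   # hypothetical layer: 128 inputs, 64 outputs
inputs = rng.standard_normal((128, 256))  # hypothetical batch of 256 input vectors
features = weight @ inputs                # features produced by the uncompressed layer

ranks = [4, 8, 16, 32, 48]                # compression levels (lower rank = more compression)
weight_errors, feature_errors = [], []
for r in ranks:
    approx = truncated_svd(weight, r)
    weight_errors.append(relative_error(weight, approx))               # error on the weights
    feature_errors.append(relative_error(features, approx @ inputs))   # error on the features

rho = spearman(weight_errors, feature_errors)
print(f"rank correlation between weight and feature error: {rho:.3f}")
```

For a single layer and a single decomposition, both errors shrink as the rank grows, so they rank the compression levels identically. The harder question the study addresses is whether such rankings still agree with the performance error when layers and decomposition methods vary as well.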