Modern neural networks have revolutionized the fields of computer vision (CV) and natural language processing (NLP). They are widely used to solve complex CV and NLP tasks such as image classification, image generation, and machine translation. Most state-of-the-art neural networks are over-parameterized and incur a high computational cost. One straightforward solution is to replace the layers of a network with low-rank tensor approximations obtained through tensor decomposition. This paper reviews six tensor decomposition methods and illustrates their ability to compress the parameters of convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers. Some compressed models can even achieve higher accuracy than their original versions. Evaluations indicate that tensor decompositions can significantly reduce model size, run-time, and energy consumption, and are well suited for deploying neural networks on edge devices.
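To make the idea of replacing a layer with a low-rank approximation concrete, the following is a minimal sketch for the simplest case, a single dense weight matrix. It uses truncated SVD as a stand-in, since the abstract does not name the six decomposition methods. The function name `low_rank_factorize`, the layer sizes, and the rank are illustrative assumptions, not values from the paper.

```python
import numpy as np


def low_rank_factorize(W, rank):
    """Split W (out_dim x in_dim) into two thin factors A @ B via truncated SVD.

    Hypothetical helper for illustration only.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # shape (out_dim, rank), singular values folded in
    B = Vt[:rank, :]            # shape (rank, in_dim)
    return A, B


rng = np.random.default_rng(0)
out_dim, in_dim, rank = 256, 512, 16  # assumed sizes, chosen for the demo

# Synthetic weight matrix that is close to low rank (assumption for the demo;
# real trained layers may or may not compress this well).
W = rng.standard_normal((out_dim, rank)) @ rng.standard_normal((rank, in_dim))
W += 0.01 * rng.standard_normal((out_dim, in_dim))

A, B = low_rank_factorize(W, rank)

# One dense layer y = W x becomes two smaller layers y = A (B x).
x = rng.standard_normal(in_dim)
y_full = W @ x
y_low = A @ (B @ x)

params_full = W.size
params_low = A.size + B.size
rel_err = np.linalg.norm(y_full - y_low) / np.linalg.norm(y_full)

print(f"parameters: {params_full} -> {params_low}")
print(f"relative output error: {rel_err:.4f}")
```

With these assumed sizes the factorized layer stores 256·16 + 16·512 parameters instead of 256·512. The same trade-off between rank and approximation error carries over to higher-order decompositions of convolutional or recurrent weights, although the factor shapes differ.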