Deep neural networks are widely used today because they can handle a variety of complex tasks, and this generality makes them powerful tools in modern technology. However, deep neural networks are often overparameterized, and deploying these large models consumes substantial computational resources. In this paper, we introduce \textbf{T}ill the \textbf{L}ayers \textbf{C}ollapse (TLC), a method that compresses deep neural networks through the lens of their batch normalization layers. By reducing the depth of these networks, our method lowers their computational requirements and overall latency. We validate TLC on popular models such as Swin-T, MobileNet-V2, and RoBERTa, across both image classification and natural language processing (NLP) tasks.
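The abstract does not spell out how batch normalization layers guide the compression. A common signal in BN-based compression, assumed here purely for illustration, is the magnitude of the learned scale parameters $\gamma$: layers whose $\gamma$ values have collapsed toward zero contribute little to the network's output and are candidates for removal. The minimal PyTorch sketch below scores layers this way; the function \texttt{bn\_layer\_scores}, the toy model, and the zero-collapse simulation are hypothetical and not the authors' implementation.

\begin{verbatim}
# Illustrative sketch (not the TLC implementation): rank layers by the
# mean magnitude of their batch-normalization scale parameters (gamma).
# Small scores suggest a layer whose contribution has collapsed.
import torch
import torch.nn as nn

def bn_layer_scores(model: nn.Module) -> dict:
    """Return mean |gamma| per BatchNorm2d layer, keyed by module name."""
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.BatchNorm2d) and module.weight is not None:
            scores[name] = module.weight.abs().mean().item()
    return scores

# Toy conv net with three BN layers (hypothetical, for demonstration only).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
)

# Simulate a collapsed middle layer by driving its gamma toward zero.
with torch.no_grad():
    model[4].weight.mul_(1e-3)

# Layers printed first (smallest mean |gamma|) would be removed first.
for name, score in sorted(bn_layer_scores(model).items(), key=lambda kv: kv[1]):
    print(f"{name}: mean |gamma| = {score:.4f}")
\end{verbatim}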